{"id":968918,"date":"2025-10-22T11:16:09","date_gmt":"2025-10-22T05:46:09","guid":{"rendered":"https:\/\/telecomlive.in\/web\/?p=968918"},"modified":"2025-10-22T11:16:09","modified_gmt":"2025-10-22T05:46:09","slug":"deepseeks-new-tool-can-extract-text-from-photos-of-pages-what-it-means-for-users","status":"publish","type":"post","link":"https:\/\/telecomlive.in\/web\/2025\/10\/22\/deepseeks-new-tool-can-extract-text-from-photos-of-pages-what-it-means-for-users\/","title":{"rendered":"DeepSeek&#8217;s new tool can extract text from photos of pages: What it means for users"},"content":{"rendered":"<p>Chinese AI startup DeepSeek has released a tool called DeepSeek OCR. This new open-source tool is designed to extract text from image files of pages with high efficiency. The project converts complex papers into a format that AI models can process quickly, with minimal memory or power consumption. The tool is designed to meet the high demand for large language models (LLMs) that must process both visual and textual data. DeepSeek OCR can process more than 200,000 pages of data per day on a single NVIDIA A100 GPU. Scaling this to a small cluster could allow training sets for larger AI models to accumulate overnight. Developers can download DeepSeek OCR from GitHub or Hugging Face and integrate it into their own applications.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Chinese AI startup DeepSeek has released a tool called DeepSeek OCR. This new open-source tool is designed to extract text from image files of pages with high efficiency. The project converts complex papers into a format that AI models can process quickly, with minimal memory or power consumption. The tool is designed to meet the high demand for large language models (LLMs) that must process both visual and textual data. DeepSeek OCR can process more than 200,000 pages of data per day on a single NVIDIA A100 GPU. 
Scaling this to a small cluster could allow training sets for larger [&hellip;]<\/p>\n","protected":false},"author":11,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[87,4,11],"tags":[],"class_list":["post-968918","post","type-post","status-publish","format-standard","hentry","category-it-2-the-times-of-india","category-newspapers","category-the-times-of-india"],"acf":[],"_links":{"self":[{"href":"https:\/\/telecomlive.in\/web\/wp-json\/wp\/v2\/posts\/968918","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/telecomlive.in\/web\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/telecomlive.in\/web\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/telecomlive.in\/web\/wp-json\/wp\/v2\/users\/11"}],"replies":[{"embeddable":true,"href":"https:\/\/telecomlive.in\/web\/wp-json\/wp\/v2\/comments?post=968918"}],"version-history":[{"count":0,"href":"https:\/\/telecomlive.in\/web\/wp-json\/wp\/v2\/posts\/968918\/revisions"}],"wp:attachment":[{"href":"https:\/\/telecomlive.in\/web\/wp-json\/wp\/v2\/media?parent=968918"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/telecomlive.in\/web\/wp-json\/wp\/v2\/categories?post=968918"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/telecomlive.in\/web\/wp-json\/wp\/v2\/tags?post=968918"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}