Scanned PDFs are image files at heart: the text on each page is rendered as pixels rather than as machine-readable characters, which is why your cursor cannot select a sentence and your copy command returns nothing.
OCR extracts text from scanned pages
Multi-page scanned PDFs supported
Editable Word output
Free, no Adobe required
Drop the PDF to Word tool into any page (blog post, product docs, intranet, school portal) with a single HTML snippet. Your visitors get the full tool, processed entirely in their browser. No backend, no uploads, no signup.
Embed code
<iframe
src="https://www.fixtools.io/pdf/pdf-to-word?embed=1"
width="100%"
height="780"
frameborder="0"
style="border:0;border-radius:16px;max-width:900px;"
title="PDF to Word by FixTools"
loading="lazy"
allow="clipboard-write"
></iframe>
Attribution-friendly: a small "Powered by FixTools" link appears in the embed footer.
When a document is scanned, the scanner captures a raster image of the page, typically stored as a JPEG or TIFF stream embedded inside the PDF container. There is no machine-readable text in the file, only pixels arranged in patterns that human eyes interpret as letters. Optical Character Recognition works by analysing those pixels, identifying shapes that match known glyph templates, and converting them to Unicode text plus positional metadata. The process involves several distinct stages: deskewing the image to correct for the slight rotation introduced by feeding a page through a scanner, binarising the image to high-contrast black and white so character edges become unambiguous, segmenting the page into text regions and non-text regions like photographs or logos, and then applying a character classifier to each text segment. Modern OCR engines like Tesseract running in WebAssembly can process a single A4 page in one to three seconds on a typical laptop. For a 20-page scanned document, expect 20 to 60 seconds of processing time entirely inside your browser tab.
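The binarising stage described above can be illustrated with Otsu's method, a common way to choose a black/white cutoff from the page's own histogram. This is a minimal sketch in plain Python and not necessarily what the FixTools engine does internally; the function names and the flat list of 0 to 255 grey values are assumptions made for illustration.

```python
def otsu_threshold(pixels):
    """Return the grey level (0-255) that best separates dark ink from light paper."""
    histogram = [0] * 256
    for value in pixels:
        histogram[value] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(histogram))

    best_threshold, best_variance = 0, -1.0
    weight_dark = 0   # number of pixels at or below the candidate level
    sum_dark = 0      # sum of grey values at or below the candidate level
    for level in range(256):
        weight_dark += histogram[level]
        if weight_dark == 0:
            continue
        weight_light = total - weight_dark
        if weight_light == 0:
            break
        sum_dark += level * histogram[level]
        mean_dark = sum_dark / weight_dark
        mean_light = (sum_all - sum_dark) / weight_light
        # Between-class variance: large when ink and paper are well separated
        between = weight_dark * weight_light * (mean_dark - mean_light) ** 2
        if between > best_variance:
            best_variance, best_threshold = between, level
    return best_threshold


def binarise(pixels, threshold):
    """Map every pixel to pure black (0) or pure white (255)."""
    return [0 if value <= threshold else 255 for value in pixels]


# Hypothetical page sample: 40 dark "ink" pixels and 60 light "paper" pixels
sample = [20] * 30 + [40] * 10 + [220] * 50 + [240] * 10
cutoff = otsu_threshold(sample)
black_and_white = binarise(sample, cutoff)
```

The cutoff lands between the darkest paper value and the lightest ink value, so faint grey noise is pushed to one side or the other before any character shapes are classified.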
Scan resolution is the single most important factor in OCR accuracy, far more important than the OCR engine or the language pack chosen. A scan at 150 DPI (dots per inch) produces roughly 1240 by 1754 pixels for an A4 page, which gives the OCR classifier very limited character detail, so it routinely confuses similar shapes such as e and c or a and o. At 300 DPI that becomes 2480 by 3508 pixels, which is the recognised standard minimum for reliable character recognition on body-size fonts. At 600 DPI, accuracy improves further for small footnote fonts and fine details such as accented characters in European languages. Most office multifunction devices default to 200 or 300 DPI. If you are rescanning a document specifically to run through OCR, set your scanner to 300 DPI for standard ten to twelve point text and 400 DPI for documents with eight-point or smaller fonts. For the source scan, TIFF saved uncompressed or with lossless compression preserves more detail than JPEG because it does not introduce lossy compression artefacts.
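The pixel counts quoted above follow directly from the A4 page size (210 by 297 mm) and the scan resolution. A small sketch of that arithmetic, with illustrative function names, also shows how many pixels a given point size gets to work with (one point is 1/72 inch):

```python
A4_MM = (210, 297)      # A4 width and height in millimetres
MM_PER_INCH = 25.4


def scan_pixels(dpi, page_mm=A4_MM):
    """Return (width, height) in pixels for a page scanned at the given DPI."""
    width_mm, height_mm = page_mm
    return (round(width_mm / MM_PER_INCH * dpi),
            round(height_mm / MM_PER_INCH * dpi))


def point_size_pixels(points, dpi):
    """Return the nominal height in pixels of a font size given in points."""
    return points * dpi / 72


for dpi in (150, 300, 600):
    print(dpi, scan_pixels(dpi), point_size_pixels(10, dpi))
```

Doubling the DPI doubles the pixels along each edge of a character, which is why small fonts benefit most from a higher setting.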
For printed, typed documents scanned at 300 DPI or above with good contrast, expect OCR accuracy of 98 to 99 percent for standard Latin characters in common business fonts. Counted per word, that works out to roughly five to ten recognition errors on a 500-word page, almost all of which are obvious misreadings you can fix in seconds. Common error patterns include 0 misread as O, 1 misread as l or I, the rn pair misread as m, and the cl pair misread as d. Numbers in tables are slightly more error-prone than continuous prose because the classifier has fewer surrounding-context cues to break ties. Handwritten text is fundamentally different: handwriting recognition requires dedicated neural models trained on handwritten samples, and standard OCR engines produce poor results on handwritten pages regardless of resolution. If your scanned document contains handwriting, plan to correct the text manually after conversion or treat the OCR output as a starting outline.
Beyond accuracy, the structure of the resulting Word document depends on how cleanly the OCR engine can identify columns, paragraphs, and tables on the page. Single-column reports with consistent margins convert with clean paragraph breaks and matched indentation. Two-column academic papers usually need a small amount of reordering because the engine occasionally reads across both columns on a single line. Tables built with visible borders are detected fairly well, while tables that rely on whitespace alignment without rules sometimes collapse into a flat sequence of paragraphs that you have to rebuild with Insert Table in Word. Knowing the layout of your source helps you predict the cleanup time, and a quick pre-scan to crop margins and check page orientation pays off in measurably better output.
Upload your scanned PDF. FixTools runs OCR on the image pages and produces an editable Word document with the recognised text. Best results come from clear, high-resolution scans.
Step-by-step guide to converting a scanned PDF to Word:
Upload your scanned PDF
Open the PDF to Word tool and drag your scanned PDF onto the upload area, or click to browse. Scanned files are often large because each page carries an embedded raster image, so allow a few seconds for the file to load fully into browser memory before the convert button becomes active. A 50-page colour scan can be 30 MB or more, which is normal and not a sign of trouble.
OCR processing
FixTools automatically detects that the pages are image-based and routes them through the embedded OCR engine, which runs as WebAssembly inside your browser. The engine deskews, binarises, segments, and classifies each page in turn, reporting progress as it goes. No image data is sent to any server; every recognition step happens on your own machine.
Review the output
The converted Word document contains the recognised text along with detected paragraph breaks and table structures. Review it for any OCR errors using Word's spell check as a quick first pass, since most genuine OCR misreads also produce spelling flags. Accuracy depends almost entirely on the scan quality of the original source PDF.
Download and edit
Click Download to save the .docx file to your downloads folder. Open it in your preferred word processor, run a final read-through to catch the small handful of typical OCR misreads, then edit, reformat, or send it on as you would any other Word document. The whole round trip for a clean 20-page scan typically takes about three minutes.
Common situations where this approach makes a real difference:
Law firm digitising paper case files
A small law firm scans 200-page paper case files at 300 DPI to create archival PDFs for long-term storage and easier remote access. Converting these PDFs to Word lets paralegals search across the full text of every case, copy clauses and quotes into new filings, and reference exhibits without trips to the filing cabinet. OCR accuracy on standard legal typewritten text at 300 DPI consistently exceeds 98 percent, keeping manual correction time under ten minutes per file even on dense pleadings, and producing files clean enough for full-text indexing in the firm's document management system.
Academic researcher transcribing archival documents
A history researcher has 1970s typewritten interview transcripts scanned at 300 DPI from a university archive holding. Converting the scanned PDFs to Word provides a working draft that captures roughly 97 percent of the text accurately, including the slightly faded carbon-copy pages. The researcher then reviews the output against the original scan side by side, correcting the remaining errors in under twenty minutes per 30-page document, and ends up with searchable transcripts that can be quoted, coded with qualitative analysis software, or shared with collaborators.
Business owner recovering records from old paper files
A small business owner needs to digitise five years of paper invoices that were scanned to PDF at 200 DPI by a previous bookkeeper. FixTools OCR extracts the vendor names, invoice amounts, dates, and reference numbers into a Word document with each invoice on its own page. The owner then copies the structured data into a spreadsheet for expense tracking, VAT reclaim, and historical analysis, avoiding complete manual reentry of several hundred line items and reducing what would have been a week of typing to an afternoon of review.
Student converting a scanned textbook chapter
A student scanned a chapter from a borrowed library textbook on a flatbed scanner at 300 DPI to support a research essay. Converting the scanned PDF to Word gives them searchable, editable text they can paste into their research notes, with the chapter's subheadings preserved as separate paragraph blocks. Footnotes and running headers on each page are captured as small text fragments that the student cleans up in under five minutes, leaving a tidy reference document with every quotable passage at their fingertips.
Get better results with these expert suggestions:
Scan at 300 DPI minimum for reliable OCR
If you are scanning a paper document specifically to convert it to Word later, use 300 DPI as your minimum scanner setting and do not let your scanner default to a lower draft mode. Documents scanned at 150 DPI produce noticeably more OCR errors on standard-sized text because the classifier has too few pixels per character to disambiguate similar shapes. Set your scanner to grayscale rather than colour for pure OCR scans, which dramatically reduces file size and improves the binary contrast that the OCR engine actually uses internally.
Use Word Find and Replace to catch common OCR errors
After conversion, run Find and Replace searches for the most common OCR substitutions: a numeric zero replacing the letter O in proper nouns, a numeric one replacing a lower-case l inside words, the pair r and n appearing as the single letter m, and the pair c and l appearing as a single d. A ten-minute pass through these common patterns cleans up most of the residual mistakes in a typical 20-page scanned document and leaves the file ready for serious editing or sharing without further proofreading.
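If you export the converted text, the same checks can be scripted. The sketch below is one possible approach, not a FixTools feature: it flags words that mix letters with the digits 0 or 1, then tries the substitutions named above against a word list. The `vocabulary` set is a hypothetical stand-in for whatever dictionary you have available, and every hit is only a candidate for manual review.

```python
import re

# (misread, intended) pairs taken from the common OCR error patterns
SWAPS = [("m", "rn"), ("d", "cl"), ("0", "o"), ("0", "O"), ("1", "l"), ("1", "I")]


def suspect_tokens(text):
    """Return words that mix letters with the digits 0 or 1, in order of appearance."""
    suspects = []
    for token in re.findall(r"[A-Za-z0-9]+", text):
        has_letter = any(ch.isalpha() for ch in token)
        has_zero_or_one = any(ch in "01" for ch in token)
        if has_letter and has_zero_or_one:
            suspects.append(token)
    return suspects


def suggest_corrections(token, vocabulary):
    """Apply each swap at each position and keep results found in the vocabulary."""
    found = set()
    for wrong, right in SWAPS:
        start = token.find(wrong)
        while start != -1:
            candidate = token[:start] + right + token[start + len(wrong):]
            if candidate in vocabulary:
                found.add(candidate)
            start = token.find(wrong, start + 1)
    return sorted(found)


# Hypothetical word list and OCR output, for illustration only
vocabulary = {"modern", "clear", "total", "Invoice", "Oxford"}
flagged = suspect_tokens("Inv0ice 2024 from 0xford Ltd, tota1 due 100")
fixes = {token: suggest_corrections(token, vocabulary) for token in flagged}
```

Pure numbers such as 2024 are left alone because they contain no letters. Pairs like rn and m cannot be spotted by pattern alone, which is why `suggest_corrections` needs a word list: "modem" is a valid word, and only context tells you whether "modern" was meant.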
Check the scanned PDF is not already text-based
Before converting a scanned PDF through OCR, try selecting a line of text in your PDF viewer first. If a blue selection box wraps around the words and you can copy them to the clipboard normally, the PDF already contains an embedded text layer, perhaps added by the scanner's built-in OCR feature or by the document's original author, and you do not need to run OCR again. Converting it as text rather than as an image produces a cleaner result, completes faster, and avoids the small accuracy loss every OCR pass introduces.
Crop blank margins before scanning to reduce file size
Large white margins around scanned content increase file size without adding any useful data for the OCR engine to work with. If your scanner software supports it, enable automatic margin cropping or set a custom scan area that hugs the printed region. A 10 MB scanned PDF carrying generous letterhead margins can shrink to about 4 MB after cropping, which speeds up the OCR conversion noticeably and reduces browser memory pressure. Tighter scans also make page-by-page review easier when you compare the output against the source.
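When the scanner software has no automatic cropping, the same idea can be applied to the page image afterwards: find the smallest rectangle that contains every non-white pixel and discard the rest. A minimal sketch, assuming the page is available as rows of 0 to 255 grey values; the `white_level` cutoff and function names are illustrative choices, not scanner or FixTools settings.

```python
def content_bounds(rows, white_level=245):
    """Return (top, left, bottom, right) of the inked region, or None if blank.

    bottom and right are exclusive, so they can be used directly as slice ends.
    """
    top = bottom = left = right = None
    for y, row in enumerate(rows):
        inked = [x for x, value in enumerate(row) if value < white_level]
        if not inked:
            continue
        if top is None:
            top = y
        bottom = y + 1
        left = inked[0] if left is None else min(left, inked[0])
        right = inked[-1] + 1 if right is None else max(right, inked[-1] + 1)
    if top is None:
        return None
    return top, left, bottom, right


def crop_to_content(rows, white_level=245):
    """Return a copy of the page trimmed to its inked region."""
    bounds = content_bounds(rows, white_level)
    if bounds is None:
        return []
    top, left, bottom, right = bounds
    return [row[left:right] for row in rows[top:bottom]]


# Hypothetical 6 x 8 page: white everywhere except a small dark block
page = [[255] * 8 for _ in range(6)]
for y in (2, 3):
    for x in (3, 4, 5):
        page[y][x] = 30

cropped = crop_to_content(page)
```

In practice you would leave a few pixels of padding around the detected region so characters at the edge are not clipped.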
Open the full PDF to Word — free, no account needed, works on any device.