AI Glossary
The Complete Dictionary of Artificial Intelligence
OCR (Optical Character Recognition)
Process of converting images of printed or handwritten text into machine-readable text data. This technology enables automatic extraction of information contained in scanned documents.
Text segmentation
Technique of dividing an image into distinct regions representing lines, words, or individual characters. Segmentation is a crucial step that determines the overall accuracy of the OCR system.
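As an illustration of line-level segmentation, the sketch below uses a horizontal projection profile: any row with no ink separates two text lines. It assumes an already binarized, deskewed image stored as a list of rows (1 = ink, 0 = background); the function name `segment_lines` is hypothetical, and real systems usually need more robust methods.

```python
def segment_lines(binary):
    """Return (top, bottom) row ranges of text lines; bottom is exclusive.

    `binary` is a list of rows, where 1 marks an ink pixel and 0 background.
    """
    lines = []
    start = None
    for y, row in enumerate(binary):
        has_ink = any(row)
        if has_ink and start is None:
            start = y                      # a text line begins
        elif not has_ink and start is not None:
            lines.append((start, y))       # a blank row closes the line
            start = None
    if start is not None:                  # line touching the bottom edge
        lines.append((start, len(binary)))
    return lines


page = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
]
print(segment_lines(page))  # [(1, 3), (4, 5)]
```

The same idea applied to columns within one line gives a rough word or character split, although touching characters defeat it.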
Image binarization
Process of converting a grayscale or color image into a binary black and white image. This transformation improves the contrast between text and background to facilitate recognition.
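One common way to choose the black/white cutoff is Otsu's method, which picks the threshold that maximizes the variance between the two resulting classes. The sketch below is a minimal pure-Python version, assuming 8-bit grayscale input as a list of rows and dark text on a light background; the names `otsu_threshold` and `binarize` are illustrative only.

```python
def otsu_threshold(gray):
    """Pick the threshold that maximizes between-class variance (Otsu's method).

    `gray` is a list of rows of integer intensities in 0..255.
    """
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels with intensity <= t
    sum0 = 0    # sum of their intensities
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue            # no pixels in the dark class yet
        if w1 == 0:
            break               # every pixel is already in the dark class
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t


def binarize(gray):
    """Map dark pixels (<= threshold) to 1 (ink) and the rest to 0."""
    t = otsu_threshold(gray)
    return [[1 if v <= t else 0 for v in row] for row in gray]


gray = [
    [200, 210, 205, 198],
    [ 30, 202,  25, 207],
    [ 28,  35,  22, 201],
]
print(binarize(gray))  # [[0, 0, 0, 0], [1, 0, 1, 0], [1, 1, 1, 0]]
```

A single global threshold works when lighting is even; unevenly lit scans usually call for a locally adaptive threshold instead.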
Image preprocessing
Set of techniques applied to images before OCR to improve text quality and readability. Includes skew correction, noise removal, and contrast enhancement.
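Of the steps listed, noise removal is the easiest to show in a few lines. The sketch below applies a 3x3 median filter, a standard way to remove isolated specks, to a grayscale image stored as a list of rows. The function name and the border handling (using only the neighbors that exist) are assumptions made for this example.

```python
from statistics import median


def median_filter(gray):
    """Replace each pixel with the median of its 3x3 neighborhood.

    Border pixels use only the neighbors that exist inside the image.
    """
    h, w = len(gray), len(gray[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [
                gray[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
            ]
            row.append(int(median(window)))
        out.append(row)
    return out


noisy = [
    [200, 200, 200],
    [200,   0, 200],   # isolated dark speck (pepper noise)
    [200, 200, 200],
]
print(median_filter(noisy))  # every pixel becomes 200
```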
Neural OCR
Modern approach to OCR that uses deep neural networks to recognize characters or whole text lines directly from pixels. It generally outperforms traditional algorithms based on hand-crafted features and heuristic rules.
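Many neural text-line recognizers emit one score vector per image column and are decoded with CTC (connectionist temporal classification). That architecture is an assumption here, not something this glossary specifies. The sketch below shows only the greedy decoding step: take the best class per frame, merge consecutive repeats, and drop the blank symbol (index 0 in this example).

```python
def ctc_greedy_decode(frame_scores, alphabet):
    """Greedy CTC decoding: best class per frame, merge repeats, drop blanks.

    `frame_scores` is a list of per-frame score lists. Index 0 is the CTC
    blank, and index i > 0 maps to alphabet[i - 1].
    """
    best = [max(range(len(f)), key=f.__getitem__) for f in frame_scores]
    chars = []
    prev = None
    for idx in best:
        if idx != prev and idx != 0:       # new symbol that is not the blank
            chars.append(alphabet[idx - 1])
        prev = idx
    return "".join(chars)


alphabet = "act"
frames = [
    [0.1, 0.1, 0.7, 0.1],    # c
    [0.1, 0.2, 0.6, 0.1],    # c (repeat, merged)
    [0.8, 0.1, 0.05, 0.05],  # blank
    [0.1, 0.7, 0.1, 0.1],    # a
    [0.9, 0.05, 0.03, 0.02], # blank
    [0.1, 0.1, 0.1, 0.7],    # t
]
print(ctc_greedy_decode(frames, alphabet))  # cat
```

The blank is what allows doubled letters: the class sequence a, blank, a decodes to "aa", while a, a decodes to "a".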
Text region detection
Algorithm that automatically identifies and locates regions containing text in a complex image. This step allows distinguishing text from images, tables, and other graphic elements.
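A classical (non-neural) starting point for locating candidate regions is connected-component analysis: group touching ink pixels and report each group's bounding box. The sketch below assumes a binarized image as a list of rows and 8-connectivity; `find_regions` is a hypothetical name. Deciding which boxes are text rather than graphics would need further filtering by size, aspect ratio, or a classifier.

```python
from collections import deque


def find_regions(binary):
    """Return bounding boxes (left, top, right, bottom) of 8-connected ink blobs.

    Coordinates are inclusive; `binary` uses 1 for ink and 0 for background.
    """
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if binary[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first flood fill from this unvisited ink pixel.
            queue = deque([(x, y)])
            seen[y][x] = True
            left, top, right, bottom = x, y, x, y
            while queue:
                cx, cy = queue.popleft()
                left, right = min(left, cx), max(right, cx)
                top, bottom = min(top, cy), max(bottom, cy)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = cx + dx, cy + dy
                        if (0 <= nx < w and 0 <= ny < h
                                and binary[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((left, top, right, bottom))
    return boxes


image = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 0, 1],
    [0, 0, 0, 1, 1],
]
print(find_regions(image))  # [(0, 0, 1, 1), (3, 1, 4, 2)]
```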
Handwriting recognition
Specialized subfield of OCR dealing with the conversion of handwriting into digital text. This task presents additional challenges due to the individual variability of writing styles.
Table extraction
Automated process of identifying and converting tabular structures in documents into structured data. Requires simultaneous recognition of text and table layout.
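Once words and their positions are recognized, a simple table can be rebuilt from geometry alone. The sketch below groups word boxes into rows by vertical position and orders each row left to right. The input format (text, x, y), the tolerance value, and the function name are assumptions for illustration; it does not handle merged cells or empty cells.

```python
def words_to_rows(words, row_tolerance=5):
    """Group OCR words into table rows by vertical position.

    `words` is a list of (text, x, y) tuples, where (x, y) is the top-left
    corner of the word box. A word joins the current row when its y differs
    by at most `row_tolerance` from the first word of that row.
    """
    rows = []   # each entry: [row_y, [(x, text), ...]]
    for text, x, y in sorted(words, key=lambda w: w[2]):
        if rows and abs(y - rows[-1][0]) <= row_tolerance:
            rows[-1][1].append((x, text))
        else:
            rows.append([y, [(x, text)]])
    # Order the cells of each row from left to right.
    return [[text for _, text in sorted(cells)] for _, cells in rows]


words = [
    ("Qty", 120, 10), ("Item", 20, 12),
    ("Pen", 20, 40), ("3", 120, 41),
    ("Ink", 20, 70), ("1", 120, 69),
]
print(words_to_rows(words))  # [['Item', 'Qty'], ['Pen', '3'], ['Ink', '1']]
```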
Multilingual OCR
Ability of an OCR system to recognize and process text in multiple languages, including several languages within the same document. Requires models trained on multilingual corpora and automatic language detection.
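Full language detection needs a statistical model, but a first routing step is often just script detection. The sketch below guesses the dominant script of a string from Unicode character names using Python's standard `unicodedata` module. The function name is hypothetical, and taking the first word of the character name as a script label is a rough heuristic, not an official Unicode script property.

```python
import unicodedata
from collections import Counter


def dominant_script(text):
    """Guess the dominant script from Unicode character names.

    The first word of a character's Unicode name (e.g. LATIN, HANGUL,
    CYRILLIC, CJK) is used as a rough script label. Returns None when the
    text contains no alphabetic characters.
    """
    counts = Counter()
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                counts[name.split()[0]] += 1
    return counts.most_common(1)[0][0] if counts else None


print(dominant_script("Optical character recognition"))  # LATIN
print(dominant_script("광학 문자 인식"))                     # HANGUL
```

Script alone cannot separate languages that share an alphabet, such as English and French, so a language model is still needed after this step.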
Layout analysis
Process of understanding the structure and organization of a document, including identifying titles, paragraphs, columns, and other layout elements. Essential for preserving reading order and the original formatting.
Character normalization
Technique for standardizing the size, orientation, and spacing of characters before recognition. This step reduces visual variability to improve recognition rates.
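Size normalization can be sketched as two steps: crop the glyph to its ink bounding box, then rescale it to a fixed grid. The example below uses nearest-neighbor sampling on a binary bitmap. The target size of 4x4 and the function names are arbitrary choices for illustration, and orientation and spacing are not handled.

```python
def resize_nearest(glyph, out_h, out_w):
    """Scale a binary glyph bitmap to out_h x out_w with nearest-neighbor sampling."""
    in_h, in_w = len(glyph), len(glyph[0])
    return [
        [glyph[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def normalize_glyph(glyph, size=4):
    """Crop the glyph to its ink bounding box, then scale it to size x size."""
    rows = [y for y, row in enumerate(glyph) if any(row)]
    cols = [x for x in range(len(glyph[0])) if any(row[x] for row in glyph)]
    if not rows:                      # empty glyph: nothing to crop
        return [[0] * size for _ in range(size)]
    cropped = [row[cols[0]:cols[-1] + 1] for row in glyph[rows[0]:rows[-1] + 1]]
    return resize_nearest(cropped, size, size)


glyph = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
for row in normalize_glyph(glyph):
    print(row)
# [1, 1, 0, 0]
# [1, 1, 0, 0]
# [1, 1, 1, 1]
# [1, 1, 1, 1]
```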
Spell checking
Post-OCR process that uses dictionaries and language models to correct recognition errors. It can significantly improve the final accuracy of the extracted text.
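A dictionary-based correction pass can be sketched with Python's standard `difflib` module, which ranks vocabulary words by string similarity. The tiny vocabulary, the 0.8 similarity cutoff, and the function names are assumptions for this example. Corrected words come back in lowercase, and words with no close match are left untouched.

```python
import difflib


def correct_word(word, vocabulary, cutoff=0.8):
    """Return the closest vocabulary word, or the word itself if none is close."""
    if word.lower() in vocabulary:
        return word
    matches = difflib.get_close_matches(word.lower(), vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else word


def correct_text(text, vocabulary):
    """Correct each whitespace-separated word independently."""
    return " ".join(correct_word(w, vocabulary) for w in text.split())


vocabulary = ["optical", "character", "recognition", "document"]
print(correct_text("optica1 charactcr recogniti0n", vocabulary))
# optical character recognition
```

Correcting words in isolation cannot fix errors that produce another valid word, which is where context-aware language models come in.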
Tesseract OCR
Open-source OCR engine originally developed at Hewlett-Packard, released as open source in 2005, and subsequently sponsored by Google. Known for its versatility and support for more than 100 languages; since version 4 it includes an LSTM-based neural recognition engine.
Complex document processing
Capability of modern OCR systems to handle documents with sophisticated layouts, including images, tables, and multiple columns. Requires advanced structural analysis algorithms.
Document indexing
Process of extracting and organizing key information from scanned documents to enable fast and efficient searching. OCR is often the first step in this process.
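The search side of indexing is commonly built on an inverted index, which maps each word to the documents that contain it. The sketch below builds one from OCR output held in a dictionary. The document ids, the sample text, and the function names are invented for illustration, and a real system would add normalization such as stemming and tolerance for OCR errors.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each lowercase word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every word of the query."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical OCR output keyed by document id.
documents = {
    "invoice_001": "Invoice total due in 30 days",
    "contract_007": "Contract renewal due in March",
}
index = build_index(documents)
print(search(index, "invoice due"))  # {'invoice_001'}
```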
Form recognition
OCR specialization focused on structured data extraction from pre-printed forms. Combines text recognition with understanding of field structure.
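For a pre-printed form with a fixed layout, field extraction can be as simple as checking which predefined zone each recognized word falls into. The sketch below assumes word boxes in (text, left, top, right, bottom) form and a hand-made template of field zones. All coordinates, field names, and the function name are hypothetical.

```python
def extract_fields(words, template):
    """Assign OCR words to form fields by the zone containing each word's center.

    `words`: list of (text, left, top, right, bottom) boxes.
    `template`: dict mapping field name to a (left, top, right, bottom) zone.
    Words are visited top to bottom, then left to right, so multi-word
    values keep their reading order.
    """
    fields = {name: [] for name in template}
    for text, left, top, right, bottom in sorted(words, key=lambda w: (w[2], w[1])):
        cx, cy = (left + right) / 2, (top + bottom) / 2
        for name, (zl, zt, zr, zb) in template.items():
            if zl <= cx <= zr and zt <= cy <= zb:
                fields[name].append(text)
                break
    return {name: " ".join(parts) for name, parts in fields.items()}


template = {"name": (100, 0, 300, 30), "date": (100, 40, 300, 70)}
words = [
    ("Lovelace", 160, 5, 230, 25),
    ("Ada", 110, 5, 150, 25),
    ("1843-10-01", 110, 45, 200, 65),
    ("Name:", 10, 5, 60, 25),      # pre-printed label, outside every zone
]
print(extract_fields(words, template))
# {'name': 'Ada Lovelace', 'date': '1843-10-01'}
```

This only works when the scan is aligned to the template, so practical systems first register the page against the blank form.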
Hybrid OCR
Approach combining multiple OCR techniques (template-based, feature-based, and neural) to maximize recognition accuracy. Uses fusion algorithms, such as voting or confidence weighting, to select the best result for each piece of text.
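One simple fusion rule is majority voting over the outputs of several engines. The sketch below assumes the outputs are already aligned word by word and have equal length, which real systems must establish first. The engine outputs shown are made-up sample data, and `fuse_by_vote` is a hypothetical name.

```python
from collections import Counter


def fuse_by_vote(candidates):
    """Fuse word-aligned outputs from several engines by majority vote.

    `candidates` is a list of token lists, one per engine, all of the same
    length. Ties go to the engine listed first.
    """
    fused = []
    for tokens in zip(*candidates):
        counts = Counter(tokens)
        top = max(counts.values())
        # First token (in engine order) that reaches the top count.
        fused.append(next(t for t in tokens if counts[t] == top))
    return fused


# Hypothetical outputs from three engines for the same text line.
template_out = ["lnvoice", "total", "42.00"]
feature_out = ["Invoice", "tota1", "42.00"]
neural_out = ["Invoice", "total", "42.O0"]
print(fuse_by_vote([template_out, feature_out, neural_out]))
# ['Invoice', 'total', '42.00']
```

Here each engine makes a different mistake, so the vote recovers a line that none of them produced on its own.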
Linguistic post-processing
Set of techniques applied after initial recognition to improve text quality using language models and grammatical rules. Often essential for reaching accuracy rates above 99%.
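An accuracy figure such as 99% only means something once the metric is fixed. A common choice is character accuracy, defined as 1 minus the character error rate (CER), where CER is the edit distance between the OCR output and a reference transcription divided by the reference length. The sketch below assumes that definition; this glossary does not say which metric it has in mind.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,           # delete ca
                            curr[j - 1] + 1,       # insert cb
                            prev[j - 1] + cost))   # substitute or match
        prev = curr
    return prev[-1]


def character_accuracy(reference, hypothesis):
    """1 - CER, where CER = edit distance / reference length."""
    if not reference:
        return 1.0 if not hypothesis else 0.0
    return 1.0 - edit_distance(reference, hypothesis) / len(reference)


before = character_accuracy("recognition", "recogniti0n")  # one wrong character
after = character_accuracy("recognition", "recognition")   # after correction
print(round(before, 3), after)  # 0.909 1.0
```

At 99% character accuracy, a page of 2,000 characters still contains about 20 errors, which is why post-processing matters for long documents.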