AI Glossary
The Complete Dictionary of Artificial Intelligence
Regex Pattern
Regular expression used to describe a search pattern in a character string, essential for extracting entities based on specific text formats.
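As a minimal sketch of format-based extraction, the snippet below pulls ISO-style dates (YYYY-MM-DD) out of a string with Python's standard `re` module. The pattern and the function name `extract_dates` are assumptions chosen for illustration, not part of any particular NER system.

```python
import re

# Hypothetical pattern for ISO-style dates (YYYY-MM-DD), chosen for illustration.
# It checks the format only, not whether the date is valid on a calendar.
DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


def extract_dates(text: str) -> list[str]:
    """Return every substring of `text` that matches the date format."""
    return [m.group(0) for m in DATE_PATTERN.finditer(text)]


print(extract_dates("Signed on 2021-03-15 and renewed 2022-04-01."))
```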
Linguistic Rule
Principle derived from the grammar or syntax of a language, applied to constrain or guide entity identification in a rule-based NER system.
Gazetteer
Reference list or dictionary of proper names (e.g., cities, first names) used by NER systems to validate or recognize entities through simple text search.
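A gazetteer lookup can be sketched as a set-membership test over token n-grams. The city list, the `MAX_NGRAM` limit, and the function name `find_cities` below are all hypothetical; a real gazetteer would be far larger and usually loaded from a file.

```python
# Hypothetical gazetteer: a few city names, lowercased for lookup.
CITY_GAZETTEER = {"paris", "rome", "new york"}
MAX_NGRAM = 2  # longest gazetteer entry, measured in tokens


def find_cities(text: str) -> list[str]:
    """Return gazetteer entries found in `text`, preferring the longest match."""
    tokens = [t.strip(".,;:!?") for t in text.split()]
    found = []
    i = 0
    while i < len(tokens):
        for n in range(MAX_NGRAM, 0, -1):
            window = tokens[i:i + n]
            if len(window) < n:
                continue  # not enough tokens left for an n-gram of this size
            candidate = " ".join(window)
            if candidate.lower() in CITY_GAZETTEER:
                found.append(candidate)
                i += n
                break
        else:
            i += 1  # no entry starts at this position
    return found


print(find_cities("She flew from New York to Rome."))
```

Trying the longest n-gram first keeps "New York" from being split into two unrelated tokens.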
Window-Based Rule
Type of rule that examines a token and its immediate context (window of words) to decide if it constitutes an entity, based on specific words or labels.
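The idea can be sketched as follows: a capitalized token is labeled as an organization if any trigger word appears within two tokens on either side. The trigger words, the window size, and the `ORG` label are assumptions made for this example.

```python
ORG_TRIGGERS = {"company", "shares", "ceo"}  # hypothetical trigger words
WINDOW = 2  # number of tokens inspected on each side of the candidate


def label_by_window(tokens: list[str]) -> list[tuple[str, str]]:
    """Label capitalized tokens as ORG when a trigger word is in their window."""
    labeled = []
    for i, tok in enumerate(tokens):
        if not tok[:1].isupper():
            continue  # only capitalized tokens are candidates in this sketch
        left = tokens[max(0, i - WINDOW):i]
        right = tokens[i + 1:i + 1 + WINDOW]
        context = {w.lower() for w in left + right}
        if context & ORG_TRIGGERS:
            labeled.append((tok, "ORG"))
    return labeled


print(label_by_window("the Acme company hired staff".split()))
```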
Nominal Ambiguity
Phenomenon where the same term can refer to different types of entities (e.g., 'Paris' as a city or person), posing a challenge for rule-based NER systems.
Left/Right Context Rule
Rule that identifies an entity based on specific words or patterns appearing immediately before (left context) or after (right context) the candidate.
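A minimal sketch, assuming whitespace-tokenized input: a token preceded by a personal title is labeled as a person, and a token followed by a corporate suffix is labeled as an organization. The cue lists and label names are hypothetical.

```python
PERSON_TITLES = {"Mr.", "Mrs.", "Dr."}    # hypothetical left-context cues
ORG_SUFFIXES = {"Inc.", "Ltd.", "Corp."}  # hypothetical right-context cues


def label_by_context(tokens: list[str]) -> list[tuple[str, str]]:
    """Label a token using only the word directly before or after it."""
    labeled = []
    for i, tok in enumerate(tokens):
        prev_tok = tokens[i - 1] if i > 0 else None
        next_tok = tokens[i + 1] if i + 1 < len(tokens) else None
        if prev_tok in PERSON_TITLES:
            labeled.append((tok, "PERSON"))  # left-context rule fired
        elif next_tok in ORG_SUFFIXES:
            labeled.append((tok, "ORG"))     # right-context rule fired
    return labeled


print(label_by_context("Dr. Rossi joined Acme Inc. today".split()))
```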
Text Normalization
Preprocessing that cleans and standardizes text (e.g., removing punctuation, lowercasing) to improve the effectiveness of regex patterns and linguistic rules.
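One possible normalization pipeline, using only the standard library, is sketched below. The specific steps (Unicode NFKC normalization, lowercasing, punctuation removal, whitespace collapsing) are an assumption about what a given system needs, not a fixed recipe.

```python
import re
import unicodedata


def normalize(text: str) -> str:
    """Standardize Unicode forms, lowercase, drop punctuation, collapse spaces."""
    text = unicodedata.normalize("NFKC", text)
    text = text.lower()
    text = re.sub(r"[^\w\s]", " ", text)  # replace punctuation with spaces
    return re.sub(r"\s+", " ", text).strip()


print(normalize("  Hello,   WORLD!  "))
```

Lowercasing discards the capital letters that capitalization-based rules rely on, so a system that uses both typically keeps the original text alongside the normalized form.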
Capitalization Rule
Heuristic rule that uses capital letters to identify potential entities such as proper nouns. Sentence-initial words are capitalized regardless of entity status, so they need separate handling to avoid false positives.
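The heuristic can be sketched as collecting runs of adjacent capitalized tokens. This version skips the first token of the sentence, on the assumption that a sentence-initial capital carries no information about entity status; the function name is hypothetical.

```python
def capitalized_candidates(sentence: str) -> list[str]:
    """Return runs of adjacent capitalized tokens, ignoring the first token."""
    tokens = [t.strip(".,;:!?") for t in sentence.split()]
    candidates, current = [], []
    for i, tok in enumerate(tokens):
        # Position 0 is skipped: every sentence starts with a capital letter.
        if i > 0 and tok[:1].isupper():
            current.append(tok)
        else:
            if current:
                candidates.append(" ".join(current))
            current = []
    if current:
        candidates.append(" ".join(current))
    return candidates


print(capitalized_candidates("Yesterday Marie Curie visited New York."))
```

Because punctuation is stripped before grouping, this sketch would merge names separated only by a comma; a fuller version would end a run at punctuation.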
Pattern Expression
Formalization of a search rule, often more complex than a simple regex, which can include constraints on grammatical labels or sentence structure.
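To illustrate a pattern that constrains grammatical labels as well as words, the sketch below matches a title word followed by one or more tokens tagged as proper nouns. The input is assumed to be already part-of-speech tagged (hand-tagged here with Penn Treebank style labels such as `NNP`); the title list and function name are hypothetical.

```python
TITLES = {"Dr.", "Prof."}  # hypothetical title words


def match_title_name(tagged: list[tuple[str, str]]) -> list[str]:
    """Match the pattern: TITLE followed by one or more tokens tagged NNP."""
    matches = []
    i = 0
    while i < len(tagged):
        word, _ = tagged[i]
        if word in TITLES:
            j = i + 1
            while j < len(tagged) and tagged[j][1] == "NNP":
                j += 1  # extend the match over consecutive proper nouns
            if j > i + 1:
                matches.append(" ".join(w for w, _ in tagged[i + 1:j]))
                i = j
                continue
        i += 1
    return matches


# Hand-tagged example sentence (the tags are an assumption, not tagger output).
sentence = [("Dr.", "NNP"), ("Marie", "NNP"), ("Curie", "NNP"),
            ("won", "VBD"), ("prizes", "NNS")]
print(match_title_name(sentence))
```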
Disambiguation
Process of resolving ambiguity to determine the correct entity type when a candidate could belong to several types, often through hierarchical (priority-ordered) rules.
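A hierarchy of rules can be sketched as an ordered list in which the first rule that fires decides the type. The cue words, the rule order, and the `UNKNOWN` fallback below are assumptions for illustration.

```python
# Priority-ordered rules: earlier entries win when several could apply.
# All cue words are hypothetical.
RULES = [
    ("PERSON", {"mr.", "mrs.", "ms."}),
    ("LOCATION", {"in", "to", "from"}),
]


def disambiguate(context_words: list[str], default: str = "UNKNOWN") -> str:
    """Return the label of the first rule whose cues appear in the context."""
    context = {w.lower() for w in context_words}
    for label, cues in RULES:
        if context & cues:
            return label
    return default


# Context around "Paris" in "Ms. Paris arrived in Rome": both rules could
# fire, and the higher-priority PERSON rule decides.
print(disambiguate(["Ms.", "arrived", "in"]))
```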
Exclusion Rule
Rule specifying conditions that, if met, prevent a text segment from being labeled as an entity, thereby reducing false positives.
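As a minimal sketch, an exclusion rule can be applied as a filter over candidates produced by earlier rules. The stoplist and the "purely numeric" condition are hypothetical examples of exclusion criteria.

```python
# Hypothetical stoplist: capitalized words that are rarely named entities.
EXCLUDED_WORDS = {"Monday", "January", "The", "I"}


def apply_exclusions(candidates: list[str]) -> list[str]:
    """Drop candidates that are on the stoplist or consist only of digits."""
    return [
        c for c in candidates
        if c not in EXCLUDED_WORDS and not c.isdigit()
    ]


print(apply_exclusions(["Paris", "Monday", "2024", "Acme"]))
```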