Information extraction in the Web era
The number of research topics covered in recent approaches to Information Extraction (IE) is continually growing as new facts are being considered. While the user's interest in extracting information from texts lies mainly in the success of the overall process of locating facts of interest in document collections, the process itself depends on several constraints (e.g. the domain, the size and location of the collection, and the document type). It currently tackles composite scenarios that include free texts as well as semi-structured and structured texts such as Web pages, e-mails, etc. The handling of all these factors is closely tied to the continued evolution of the underlying technologies. In the last few years, real-world applications have shown the need for scalable, adaptable IE systems (see M. T. Pazienza, "Information Extraction: Towards Scalable Adaptable Systems", LNAI 1714) that limit the human intervention required to customize an IE application and port it to new domains. Scalability and adaptability remain valid requirements and become even more relevant in a Web scenario, in which intelligent information agents are expected to gather information automatically from heterogeneous sources.