Linguateca www.linguateca.pt

A Geographic Knowledge Base for Semantic Web Applications
Marcirio Silveira Chaves, Mário J. Silva, Bruno Martins
20th Brazilian Symposium on Databases - SBBD 2005, Uberlândia - MG

Motivation/Context
• GKB - Geographic Knowledge Base
  – Geographic
  – Network
• Information exported as ontologies
• Geographic-aware Semantic Web applications
• GREASE - Geographic Reasoning for Search Engines

2005-10-03 20th Brazilian Symposium on Databases

Presentation Structure
• Conceptual Design of GKB
• Knowledge Integration
• Using Geographic Knowledge in GKB
• GKB as an Ontology
• Statistics of the Ontologies Created
• Applications using GKB
• Final Remarks

Information Sources used by GKB
• Geo-Administrative and Geo-Physical Domain
  – Administrative
  – Postal
  – Gazetteers
  – Wikipedia
• Network Domain
  – FCCN
    • Web domains
    • Web sites

Architecture of GKB
[Figure: architecture of GKB]

Feature concept in GKB
• A meaningful object in the selected domain of discourse [ISO19109].
• Examples: countries, cities and localities

Conceptual Design of GKB
• GKB meta-model

Knowledge Integration in GKB
• GKB hierarchy from different information sources
• Algorithm:
  – Search for the lowest feature types that are common to both hierarchies.
  – If such feature types exist, identify the instances that are common to both hierarchies.
  – Starting from the common instances, go up each hierarchy and find the lowest common ancestor.
  – Measure the distance (in number of partOf relationships) between the common instances and their ancestor in each hierarchy. The ancestor with the smallest distance to the common instances is linked, through a partOf relationship, to the ancestor in the other hierarchy. The existing relationships in both hierarchies are maintained.
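The algorithm above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the GKB implementation: the dictionary layout, the function names and the specific partOf edges in `H1` and `H2` are hypothetical (the feature and type names follow the deck's Norte/Porto example), each feature is assumed to have a single parent within its own hierarchy, and common instances are simply features that appear in both hierarchies with the same feature type.

```python
# Each hierarchy maps a feature name to (feature_type, parent_name or None).
# The partOf edges below are assumed for illustration.
H1 = {
    "Norte": ("NUT2", None),
    "Grande Porto": ("NUT3", "Norte"),
    "Tâmega": ("NUT3", "Norte"),
    "Matosinhos": ("MUNICIPALITY", "Grande Porto"),
    "Vila Nova de Gaia": ("MUNICIPALITY", "Grande Porto"),
    "Penafiel": ("MUNICIPALITY", "Tâmega"),
}
H2 = {
    "Porto": ("DISTRITO", None),
    "Matosinhos": ("MUNICIPALITY", "Porto"),
    "Vila Nova de Gaia": ("MUNICIPALITY", "Porto"),
    "Penafiel": ("MUNICIPALITY", "Porto"),
}


def proper_ancestors(hierarchy, name):
    """Return the ancestors of `name`, nearest first, following partOf links."""
    chain = []
    parent = hierarchy[name][1]
    while parent is not None:
        chain.append(parent)
        parent = hierarchy[parent][1]
    return chain


def lowest_common_ancestor(hierarchy, names):
    """Return (ancestor, distance) for the lowest ancestor shared by all names.

    `distance` is the largest number of partOf links from any name to that
    ancestor. Returns None when the names share no ancestor.
    """
    chains = [proper_ancestors(hierarchy, n) for n in names]
    shared = set(chains[0]).intersection(*chains[1:])
    for candidate in chains[0]:  # nearest ancestors are checked first
        if candidate in shared:
            return candidate, max(c.index(candidate) + 1 for c in chains)
    return None


def merge_hierarchies(h1, h2):
    """Return (new_edge, all_edges); edges are (child, parent) partOf pairs."""
    # Common instances: same name and same feature type in both hierarchies.
    common = sorted(n for n in h1 if n in h2 and h1[n][0] == h2[n][0])
    if not common:
        return None
    lca1 = lowest_common_ancestor(h1, common)
    lca2 = lowest_common_ancestor(h2, common)
    if lca1 is None or lca2 is None:
        return None
    (a1, d1), (a2, d2) = lca1, lca2
    # The ancestor closer to the common instances becomes partOf the other one.
    # On a tie, h1's ancestor is used as the child (an arbitrary choice).
    new_edge = (a2, a1) if d2 < d1 else (a1, a2)
    # Existing relationships from both hierarchies are kept.
    edges = {(n, p) for h in (h1, h2) for n, (_, p) in h.items() if p is not None}
    edges.add(new_edge)
    return new_edge, edges


new_edge, edges = merge_hierarchies(H1, H2)
print(new_edge)  # ('Porto', 'Norte')
```

In this example the common ancestor in H2 (Porto) is one partOf link away from the shared municipalities, while the one in H1 (Norte) is two links away, so the sketch adds Porto partOf Norte and keeps every original edge.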
Knowledge Integration in GKB
• GKB hierarchy from different information sources
  – H1: NUT2 (Norte); NUT3 (Grande Porto, Tâmega); MUNICIPALITY (Matosinhos, Vila Nova de Gaia, Penafiel)
  – H2: DISTRITO (Porto); MUNICIPALITY (Matosinhos, Vila Nova de Gaia, Penafiel)
• Merged Hierarchy
  – Norte
  – Grande Porto, Tâmega, Porto
  – Matosinhos, Vila Nova de Gaia, Penafiel

Using Geographic Knowledge in GKB
• Geographic scopes
  – Example: the site www.cm-lisboa.pt has the scope Lisboa (municipality)
• Rules
• New relationships and knowledge
• Description Logics (DLs)
• Geo domain
  – Names composed of multiple words are represented in several forms
• Network domain
  – Host names in URLs are decomposed into their domain parts

Using Geographic Knowledge in GKB
• ABox in DLs for:
  – the municipality of Santiago do Cacém
    geoFeatureName(270,"santiagodocacem").
    geoFeatureName(270,"santiagocacem").
    geoFeatureName(270,"santiago-do-cacem").
    geoFeatureName(270,"santiago-cacem").
    geoFeatureType(270,"CON").
  – the web site www.cm-santiago-do-cacem.pt
    netSiteSubDomain(33684,"www").
    netSitePrefix(33684,"cm").
    netSiteDomainToken(33684,"santiago-do-cacem").
    netSiteTLD(33684,"pt").

Using Geographic Knowledge in GKB
• Terminology Description (TBox in DLs)
  – Municipalities
    hasScope(idN,idG) ← netSiteDomainToken(idN,X) ∧ (netSitePrefix(idN,"cm") ∨ netSitePrefix(idN,"mun")) ∧ geoFeatureType(idG,"CON") ∧ geoFeatureName(idG,X).

Using Geographic Knowledge in GKB
• Example: applying the municipality rule
    hasScope(idN,idG) ← netSiteDomainToken(idN,X) ∧ (netSitePrefix(idN,"cm") ∨ netSitePrefix(idN,"mun")) ∧ geoFeatureType(idG,"CON") ∧ geoFeatureName(idG,X).
  to the facts
    netSiteDomainToken(33684,"santiago-do-cacem").
    netSitePrefix(33684,"cm").
    geoFeatureType(270,"CON").
    geoFeatureName(270,"santiago-do-cacem").
  New knowledge:
    hasScope(33684,270).

Using Geographic Knowledge in GKB
• Rule-based scopes assigned by GKB to sites of Portugal

  Site type                      # of sites    # of matches
  districts (distritos)          33            17 (52%)
  municipalities                 288           261 (90%)
  civil parishes (freguesias)    300           124 (41%)
  basic schools                  1955          124 (6%)
  training centers               152           55 (36%)
  high schools                   402           105 (26%)

• Scopes are extended to the web pages under each of the sites in matching subdomains

GKB as an Ontology
• Geo-Net-PT01

<gn:Geo_Feature rdf:ID="GEO_238">
  <gn:geo_id>238</gn:geo_id>
  <gn:geo_name xml:lang="pt">Porto</gn:geo_name>
  <gn:geo_type_id rdf:resource="#CON"/>
  <gn:info_source_id rdf:resource="#INE"/>
  <gn:related_to>
    <rdf:Bag>
      <rdf:li>
        <gn:Geo_Relationship>
          <gn:rel_type_id rdf:resource="#PRT"/>
          <gn:geo_id><rdf:Bag>
            <rdf:li rdf:resource="#GEO_130"/>
            <rdf:li rdf:resource="#GEO_3967"/>
          </rdf:Bag></gn:geo_id>
        </gn:Geo_Relationship>
      </rdf:li>
      <rdf:li><gn:Geo_Relationship>
        <gn:rel_type_id rdf:resource="#ADJ"/>
        <gn:geo_id>
          <rdf:Bag>
            <rdf:li rdf:resource="#GEO_127"/>
            <rdf:li rdf:resource="#GEO_156"/>
            <rdf:li rdf:resource="#GEO_162"/>
            <rdf:li rdf:resource="#GEO_331"/>
          </rdf:Bag>
        </gn:geo_id>
      </gn:Geo_Relationship></rdf:li>
    </rdf:Bag>
  </gn:related_to>
  <gn:population>263131</gn:population>
</gn:Geo_Feature>

Statistics of the Ontologies Created

  Statistic                                               Portugal            World
  # of features                                           418,065             12,293
  # of relationships                                      419,867             12,258
  # of part-of relationships                              418,340 (99.83%)    12,245 (99.89%)
  # of equivalence relationships                          395 (0.09%)         2,501 (20.40%)
  # of adjacency relationships                            1,132 (0.27%)       13 (0.10%)
  Avg. broader features per feature                       1.0016              1.07
  Avg. narrower features per feature                      10.56               475.44
  Avg. equivalent features per feature with equivalent    1.99                3.82
  Avg. adjacent features per feature with adjacent        3.54                6.5
  # of features without ancestors                         3 (0.00%)           1 (0.00%)
  # of features without descendants                       374,349 (89.54%)    12,045 (97.98%)
  # of features without equivalent                        417,867 (99.95%)    11,819 (96.14%)
  # of features without adjacent                          417,739 (99.92%)    12,291 (99.99%)

Applications using GKB
• NERC tool for recognizing geographical references in text
• Classification tool for assigning documents to a corresponding geographical scope
• Information retrieval interface for geographical queries

Applications using GKB
[Figure: applications using GKB]

Final Remarks
• A domain-independent model for storing geographic and network knowledge
• Sharing of the collected knowledge as formal ontologies
• Geo-Net-PT01: the first public geographic ontology of Portugal - http://xldb.fc.ul.pt/geonetpt
• Future work
  – Augmenting the knowledge in GKB with geographic entities extracted from the texts of the Portuguese Web
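As a supplementary illustration of the "Using Geographic Knowledge in GKB" slides, the following Python sketch shows one way the name variants, the host decomposition and the municipality scope rule could fit together. It is not the GKB implementation: the function names, the dictionary layout and the `CONNECTIVES` list are hypothetical, and the host parser assumes exactly three dot-separated labels with a prefix-token middle label, as in the Santiago do Cacém example.

```python
import unicodedata

# Assumed list of Portuguese connectives that may be dropped from a name.
CONNECTIVES = {"de", "do", "da", "dos", "das"}


def name_variants(name):
    """Return lowercase, accent-free variants of a multi-word place name."""
    ascii_name = (
        unicodedata.normalize("NFKD", name)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    tokens = ascii_name.lower().split()
    content = [t for t in tokens if t not in CONNECTIVES]
    return {
        "".join(tokens),
        "".join(content),
        "-".join(tokens),
        "-".join(content),
    }


def decompose_host(host):
    """Split a host such as 'www.cm-x.pt' into subdomain, prefix, token, TLD.

    Assumes exactly three dot-separated labels and a 'prefix-token' middle
    label; other host shapes are out of scope for this sketch.
    """
    subdomain, middle, tld = host.split(".")
    prefix, _, token = middle.partition("-")
    return {"subdomain": subdomain, "prefix": prefix, "token": token, "tld": tld}


def has_scope(site, feature):
    """Mirror the municipality rule: the site prefix is 'cm' or 'mun', the
    feature type is 'CON', and the domain token is one of the feature names."""
    return (
        site["prefix"] in {"cm", "mun"}
        and feature["type"] == "CON"
        and site["token"] in feature["names"]
    )


site = decompose_host("www.cm-santiago-do-cacem.pt")
feature = {"id": 270, "type": "CON", "names": name_variants("Santiago do Cacém")}
print(has_scope(site, feature))  # True
```

Here `name_variants` reproduces the four geoFeatureName forms listed for feature 270, and `has_scope` returns True for site 33684's host, matching the derived fact hasScope(33684,270).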