Linguateca
www.linguateca.pt
A Geographic Knowledge Base
for Semantic Web Applications
Marcirio Silveira Chaves
Mário J. Silva
Bruno Martins
20º Brazilian Symposium on Databases - SBBD 2005
Uberlândia - MG
Motivation/Context
• GKB - Geographic Knowledge Base
– Geographic
– Network
• Information exported as ontologies
• Geographic-aware Semantic Web
applications
• GREASE – Geographic Reasoning for
Search Engines
2005-10-03
20º Brazilian Symposium on Databases
Presentation Structure
Conceptual Design of GKB
Knowledge Integration
Using Geographic Knowledge in GKB
GKB as an Ontology
Statistics of the Ontologies Created
Applications using GKB
Final Remarks
Information Sources used by GKB
• Geo-Administrative and Geo-Physical Domain
– Administrative
– Postal
– Gazetteers
– Wikipedia
• Network Domain
– FCCN
• Web domains
• Web sites
Architecture of GKB
Feature concept in GKB
• A meaningful object in the selected
domain of discourse [ISO19109].
Ex.: countries, cities and localities
Conceptual Design of GKB
• GKB meta-model
Knowledge Integration in GKB
• GKB hierarchy from different information sources
• Algorithm:
– It searches for the lowest common feature types in both hierarchies
– If such types exist, it identifies the instances common to both hierarchies
– Once the common instances are identified, it goes up each hierarchy and searches for the lowest common ancestor
– It checks the distance (in number of partOf relationships) between the common instances of the feature types and their ancestors. The ancestor with the smallest distance to the common instances is merged, through a partOf relationship, with the ancestor in the other hierarchy.
The existing relationships in both hierarchies are maintained.
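The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the GKB implementation: each hierarchy is modelled as a child-to-parent mapping over hypothetical (name, type) pairs, features that appear as children in both hierarchies are treated as the common instances (which folds the feature-type check into the key), and all function names are invented for this example. The sample data follows the Norte / Porto example from the presentation; the assignment of municipalities to NUT3 regions is taken from the real administrative division, not from the slides.

```python
def ancestors(parent, node):
    """Return the ancestors of node, nearest first, following partOf links."""
    chain = []
    while node in parent:
        node = parent[node]
        chain.append(node)
    return chain


def lowest_common_ancestor(parent, nodes):
    """Return (ancestor, distance) for the lowest ancestor shared by all nodes.

    distance is the largest number of partOf links from any node up to it.
    """
    chains = [ancestors(parent, n) for n in nodes]
    for candidate in chains[0]:
        if all(candidate in chain for chain in chains):
            distance = max(chain.index(candidate) + 1 for chain in chains)
            return candidate, distance
    return None, None


def merge_hierarchies(h1, h2):
    """Merge two child -> parent mappings into one set of partOf edges."""
    edges = set(h1.items()) | set(h2.items())  # existing relationships are kept
    common = sorted(set(h1) & set(h2))         # instances present in both
    if not common:
        return edges
    a1, d1 = lowest_common_ancestor(h1, common)
    a2, d2 = lowest_common_ancestor(h2, common)
    if a1 is None or a2 is None or a1 == a2:
        return edges
    # The ancestor closer to the common instances becomes partOf the other one.
    # Breaking ties in favour of h2 is an arbitrary choice made for this sketch.
    edges.add((a2, a1) if d2 <= d1 else (a1, a2))
    return edges


NORTE = ("Norte", "NUT2")
GRANDE_PORTO = ("Grande Porto", "NUT3")
TAMEGA = ("Tâmega", "NUT3")
PORTO = ("Porto", "DISTRITO")
MATOSINHOS = ("Matosinhos", "MUNICIPALITY")
GAIA = ("Vila Nova de Gaia", "MUNICIPALITY")
PENAFIEL = ("Penafiel", "MUNICIPALITY")

h1 = {GRANDE_PORTO: NORTE, TAMEGA: NORTE,
      MATOSINHOS: GRANDE_PORTO, GAIA: GRANDE_PORTO, PENAFIEL: TAMEGA}
h2 = {MATOSINHOS: PORTO, GAIA: PORTO, PENAFIEL: PORTO}

merged = merge_hierarchies(h1, h2)
print((PORTO, NORTE) in merged)  # the new partOf link between the hierarchies
```

In this sketch the district Porto sits one partOf link above the shared municipalities, while Norte sits two links above them, so Porto is attached under Norte and every original edge is preserved.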
Knowledge Integration in GKB
• GKB hierarchy from different information sources
H1
– NUT2: Norte
– NUT3: Grande Porto, Tâmega
– MUNICIPALITY: Matosinhos, Vila Nova de Gaia, Penafiel
H2
– DISTRITO: Porto
– MUNICIPALITY: Matosinhos, Vila Nova de Gaia, Penafiel
Knowledge Integration in GKB
Merged Hierarchy
– Top level: Norte
– Second level: Grande Porto, Tâmega, Porto
– Third level: Matosinhos, Vila Nova de Gaia, Penafiel
Using Geographic Knowledge in GKB
• Geographic scopes
– www.cm-lisboa.pt
– Lisboa (municipality)
• Rules
• New relationships and knowledge
• Description Logics (DLs)
• Geo domain
– Names composed of multiple words are represented in different ways
• Network domain
– URL names are decomposed into their corresponding domain parts
Using Geographic Knowledge in GKB
• ABox in DLs for the:
– municipality of Santiago do Cacém
geoFeatureName(270,“santiagodocacem”).
geoFeatureName(270,“santiagocacem”).
geoFeatureName(270,“santiago-do-cacem”).
geoFeatureName(270,“santiago-cacem”).
geoFeatureType(270,“CON”).
– web site: www.cm-santiago-do-cacem.pt
netSiteSubDomain(33684,“www”).
netSitePrefix(33684,“cm”).
netSiteDomainToken(33684,“santiago-do-cacem”).
netSiteTLD(33684,“pt”).
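The facts above could be produced by code along the following lines. This is a hedged sketch rather than GKB's actual procedure: the normalization steps (stripping accents, lowercasing, optionally dropping connectives such as "do") are inferred from the Santiago do Cacém example, the `STOPWORDS` list is an assumption, and `name_variants` and `decompose_site` are hypothetical names. The decomposition assumes a host of the form subdomain.prefix-token.tld.

```python
import unicodedata

# Assumed list of Portuguese connectives that may be dropped from a name.
STOPWORDS = {"de", "do", "da", "dos", "das"}


def name_variants(name):
    """Return the spellings under which a multi-word name may appear in a URL."""
    ascii_name = (
        unicodedata.normalize("NFKD", name)
        .encode("ascii", "ignore")
        .decode("ascii")
        .lower()
    )
    words = ascii_name.split()
    content = [w for w in words if w not in STOPWORDS]
    return {
        "".join(words),
        "".join(content),
        "-".join(words),
        "-".join(content),
    }


def decompose_site(host):
    """Split a host name into the four parts used by the netSite* facts."""
    subdomain, rest = host.split(".", 1)
    label, tld = rest.rsplit(".", 1)
    prefix, _, token = label.partition("-")
    return {
        "netSiteSubDomain": subdomain,
        "netSitePrefix": prefix,
        "netSiteDomainToken": token,
        "netSiteTLD": tld,
    }


print(sorted(name_variants("Santiago do Cacém")))
print(decompose_site("www.cm-santiago-do-cacem.pt"))
```

Storing every variant as a separate geoFeatureName fact keeps the matching step a plain equality test against the domain token.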
Using Geographic Knowledge in GKB
• Terminology Description (TBox in DLs)
– Municipalities
hasScope(idN,idG) ←
netSiteDomainToken(idN,X) ∧
(netSitePrefix(idN,“cm”) ∨ netSitePrefix(idN,“mun”)) ∧
geoFeatureType(idG,“CON”) ∧
geoFeatureName(idG,X).
Using Geographic Knowledge in GKB
• Ex.:
hasScope(idN,idG) ←
netSiteDomainToken(idN,X) ∧
(netSitePrefix(idN,“cm”) ∨ netSitePrefix(idN,“mun”)) ∧
geoFeatureType(idG,“CON”) ∧
geoFeatureName(idG,X).
netSiteDomainToken(33684, “santiago-do-cacem”).
netSitePrefix(33684, “cm”).
geoFeatureType(270, “CON”).
geoFeatureName(270, “santiago-do-cacem”).
New knowledge: hasScope(33684, 270).
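The derivation above can be reproduced with a few lines of Python. This is only an illustrative sketch: facts are stored as hypothetical (predicate, id, value) triples and the rule is hard-coded in `infer_municipality_scopes`, a name invented here; GKB itself expresses the rule in Description Logics.

```python
facts = {
    ("netSiteDomainToken", 33684, "santiago-do-cacem"),
    ("netSitePrefix", 33684, "cm"),
    ("geoFeatureType", 270, "CON"),
    ("geoFeatureName", 270, "santiago-do-cacem"),
}


def infer_municipality_scopes(facts):
    """Apply the municipality rule and return the derived hasScope facts."""

    def pairs(predicate):
        return {(i, v) for (p, i, v) in facts if p == predicate}

    tokens = pairs("netSiteDomainToken")
    prefixes = pairs("netSitePrefix")
    types = pairs("geoFeatureType")
    names = pairs("geoFeatureName")

    scopes = set()
    for id_n, token in tokens:
        # The site prefix must be "cm" or "mun".
        if not ({(id_n, "cm"), (id_n, "mun")} & prefixes):
            continue
        for id_g, name in names:
            # The feature must be a municipality (CON) whose name equals the token.
            if name == token and (id_g, "CON") in types:
                scopes.add(("hasScope", id_n, id_g))
    return scopes


print(infer_municipality_scopes(facts))
```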
Using Geographic Knowledge in GKB
• Scopes assigned by GKB rules to Portuguese web sites
Site Type | # of sites | # of matches
distritos | 33 | 17 (52%)
municipalities | 288 | 261 (90%)
freguesias | 300 | 124 (41%)
basic schools | 1955 | 124 (6%)
training centers | 152 | 55 (36%)
high schools | 402 | 105 (26%)
• Scopes are extended to the web pages under each of the sites with matching subdomains
GKB as an Ontology
• Geo-Net-PT01
<gn:Geo_Feature rdf:ID="GEO_238">
  <gn:geo_id>238</gn:geo_id>
  <gn:geo_name xml:lang="pt">Porto</gn:geo_name>
  <gn:geo_type_id rdf:resource="#CON"/>
  <gn:info_source_id rdf:resource="#INE"/>
  <gn:related_to>
    <rdf:Bag>
      <rdf:li>
        <gn:Geo_Relationship>
          <gn:rel_type_id rdf:resource="#PRT"/>
          <gn:geo_id><rdf:Bag>
            <rdf:li rdf:resource="#GEO_130"/>
            <rdf:li rdf:resource="#GEO_3967"/>
          </rdf:Bag></gn:geo_id>
        </gn:Geo_Relationship>
      </rdf:li>
      <rdf:li><gn:Geo_Relationship>
        <gn:rel_type_id rdf:resource="#ADJ"/>
        <gn:geo_id>
          <rdf:Bag>
            <rdf:li rdf:resource="#GEO_127"/>
            <rdf:li rdf:resource="#GEO_156"/>
            <rdf:li rdf:resource="#GEO_162"/>
            <rdf:li rdf:resource="#GEO_331"/>
          </rdf:Bag>
        </gn:geo_id>
      </gn:Geo_Relationship></rdf:li>
    </rdf:Bag>
  </gn:related_to>
  <gn:population>263131</gn:population>
</gn:Geo_Feature>
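A consumer of the ontology could read such a record with Python's standard `xml.etree.ElementTree` module, as in the sketch below. Several things here are assumptions: the `gn` namespace URI is a placeholder (the slide does not give it), the record is wrapped in an `rdf:RDF` root so that it parses on its own, the Geo_Feature is abbreviated to keep the example short, and `relationships` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
GN = "http://example.org/geo-net-pt#"  # placeholder, not the real namespace

DOC = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:gn="{GN}">
  <gn:Geo_Feature rdf:ID="GEO_238">
    <gn:geo_id>238</gn:geo_id>
    <gn:geo_name xml:lang="pt">Porto</gn:geo_name>
    <gn:related_to><rdf:Bag>
      <rdf:li><gn:Geo_Relationship>
        <gn:rel_type_id rdf:resource="#PRT"/>
        <gn:geo_id><rdf:Bag>
          <rdf:li rdf:resource="#GEO_130"/>
          <rdf:li rdf:resource="#GEO_3967"/>
        </rdf:Bag></gn:geo_id>
      </gn:Geo_Relationship></rdf:li>
      <rdf:li><gn:Geo_Relationship>
        <gn:rel_type_id rdf:resource="#ADJ"/>
        <gn:geo_id><rdf:Bag>
          <rdf:li rdf:resource="#GEO_127"/>
          <rdf:li rdf:resource="#GEO_156"/>
        </rdf:Bag></gn:geo_id>
      </gn:Geo_Relationship></rdf:li>
    </rdf:Bag></gn:related_to>
  </gn:Geo_Feature>
</rdf:RDF>"""


def relationships(feature):
    """Map each relationship type of a Geo_Feature to its target resources."""
    result = {}
    for rel in feature.iter(f"{{{GN}}}Geo_Relationship"):
        rel_type = rel.find(f"{{{GN}}}rel_type_id").get(f"{{{RDF}}}resource")
        targets = [li.get(f"{{{RDF}}}resource") for li in rel.iter(f"{{{RDF}}}li")]
        result[rel_type] = targets
    return result


root = ET.fromstring(DOC)
feature = root.find(f"{{{GN}}}Geo_Feature")
name = feature.find(f"{{{GN}}}geo_name").text
rels = relationships(feature)
print(name, rels)
```

Reading the slide's identifiers, `#PRT` appears to denote part-of and `#ADJ` adjacency, so the result groups the features Porto is part of separately from the ones it borders.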
Statistics of the Ontologies Created
Statistic | Portugal | World
# of features | 418,065 | 12,293
# of relationships | 419,867 | 12,258
# of part-of relationships | 418,340 (99.83%) | 12,245 (99.89%)
# of equivalence relationships | 395 (0.09%) | 2,501 (20.40%)
# of adjacency relationships | 1,132 (0.27%) | 13 (0.10%)
Avg. broader features per feature | 1.0016 | 1.07
Avg. narrower features per feature | 10.56 | 475.44
Avg. equivalent features per feature with equivalent | 1.99 | 3.82
Avg. adjacent features per feature with adjacent | 3.54 | 6.5
# of features without ancestors | 3 (0.00%) | 1 (0.00%)
# of features without descendants | 374,349 (89.54%) | 12,045 (97.98%)
# of features without equivalent | 417,867 (99.95%) | 11,819 (96.14%)
# of features without adjacent | 417,739 (99.92%) | 12,291 (99.99%)
Applications using GKB
• NERC tool for recognizing geographical
references in text
• Classification tool for assigning
documents to a corresponding
geographical scope
• Information retrieval interface for
geographical queries
Final Remarks
• A domain-independent model for storing geographic and
network knowledge
• Sharing of the collected knowledge as formal ontologies
• Geo-Net-PT01: The first public geographic ontology of
Portugal - http://xldb.fc.ul.pt/geonetpt
• Future work
– Augmenting the knowledge in GKB with geographic
entities extracted from the texts of the Portuguese Web