Mário J. Silva
IST/INESC-ID, Portugal
REACTION
REACTION Workshop
2013.07.31
Overview
Porto, FEUP
•  11:30 Welcome + Quick progress report and status
summary
•  11:45 Task leaders summarize ongoing activities
(10 min each max)
•  13:00 Break.
•  14:30 Technical Presentations
•  15:30 Break
•  16:00 Short Technical Presentations / Demos
•  18:00 Directions for next meeting/Next workshop
•  18:15 Meeting ends.
Agenda
•  Luís Rei (on Verbetes)
•  André Lopes (on Matching
POWER, Wikipedia, …)
Technical Presentations
•  Jorge Teixeira (mundo numa
rede/sapo stuff)
•  Arian Pasquali (twitterecho
update)
•  Gustavo Laboreiro (twitterecho
pre-processing modules)
•  Tiago Cunha (spatial-temporal
data mining in Twitter data)
•  Jorge Moreira (information
extraction from social media forums, blogs)
•  João Santos (entity linking
Público 10 anos)
•  João Oliveira (ngrams Público
10 anos)
•  David Batista (network
analysis Público 10 anos)
•  Raquel Albuquerque (Público
Newsroom)
•  Rui Silva (POS Tagger)
•  David Forte (Geo Referencing
Portuguese text)
Short Technical Presentations
(5-10 min each)
•  Computational journalism, aka
database journalism
–  Intensive use of software tools for news
research, production and presentation
•  What is the
impact on the routines of newsrooms?
•  What effect will these tools have on the
quality of news and the
productivity of journalists?
The Problem...
1.  Automatic content analysis
(documents, news, blogs, micro-blogs, comments)
2.  Automatic analysis of
explicit and implicit social networks
3.  Design of
rich visualization and interaction interfaces
4.  Case-study evaluation of developed computational
journalism methodology in a production setting.
Critical analysis of practical impact on newsroom
quality, efficiency, and economics.
Challenges
Partnership
•  LASIGE, FCUL  INESC-ID, IST
(Mário J. Silva, Paula Carvalho and Francisco
Couto from FCUL)
•  LIACC, FEUP
(Eugénio de Oliveira, Eduarda M. Rodrigues, Luís
Sarmento)
•  CIMJ, FCH/UNL (António Granado)
•  Austin: School of Information and Computer
Science at Austin
(Luis Francisco-Revilla,
Matthew Lease)
•  PT Comunicações, SAPO
(Benjamim Júnior, Celso Martinho, Luís Sarmento)
•  Público (Sérgio B. Gomes)
•  INESC-ID:
–  David Batista, Silvio Moreira
–  Diogo Figueiredo, João Ramalho,
João Oliveira, João Santos
–  Rui Silva, David Forte
•  UP:
–  Matko Bosnjak, Arian Pasquali, Gustavo Laboreiro,
Andrija Cajic, Nuno Baldaia, Tiago Cunha, Jorge
Moreira
–  Jorge Teixeira, Luís Rei (SAPO)
•  UT Austin: Hohyon Ryu, Steven Fazzio
•  UNL: Raquel Albuquerque
Students
1.  Information Mining
2.  Information Discovery
3.  Web Community Sensing
4.  Tracking Information Flow
5.  Interaction and Personalization
6.  Query and Visualization
7.  Computational Newsroom
Research tasks
Research tasks - Leaders
1.  Information Mining: Paula Carvalho
2.  Information Discovery: Bruno Martins (was Francisco)
3.  Web Community Sensing: Eduarda Rodrigues
4.  Tracking Information Flow: Francisco Couto (was Matt)
5.  Interaction & Personalization: Mário J. Silva (was Revilla)
6.  Query and Visualization: Eduarda Rodrigues (Sarmento)
7.  Computational Newsroom: António Granado (Mário covers)
•  Development of robust linguistic resources to
process different types and genres of texts
–  knowledge resources about media personalities:
recognizing and resolving references to named entities;
–  sentiment lexicons and grammars: detecting the
polarity of opinions about relevant personalities
–  annotated corpora: training different text classifiers
and evaluating classification procedures
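As an illustration only, the polarity detection mentioned above could be sketched with a tiny lexicon lookup. The lexicon entries, the negation list and the `polarity` function are hypothetical and are not taken from the project's resources; a real sentiment lexicon and grammar would be much richer.

```python
import re

# Hypothetical toy lexicon; a real sentiment lexicon would be far larger.
LEXICON = {"good": 1, "excellent": 1, "honest": 1,
           "bad": -1, "corrupt": -1, "weak": -1}
# Hypothetical negation words that flip the polarity of the next word.
NEGATORS = {"not", "never", "no"}


def polarity(sentence: str) -> int:
    """Return 1 (positive), -1 (negative) or 0 (neutral/unknown)."""
    tokens = re.findall(r"\w+", sentence.lower())
    score = 0
    for i, tok in enumerate(tokens):
        value = LEXICON.get(tok, 0)
        # Flip the value when the word is directly preceded by a negator.
        if value and i > 0 and tokens[i - 1] in NEGATORS:
            value = -value
        score += value
    # Reduce the summed score to its sign.
    return (score > 0) - (score < 0)


if __name__ == "__main__":
    print(polarity("The minister gave an excellent speech"))
    print(polarity("The minister is not honest"))
```

Resolving which personality an opinion targets would require the named-entity resources listed above; this sketch only scores a sentence as a whole.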
Information Mining
Relationship extraction techniques to support
information discovery in journalists’
activities
•  Entity Ranking: finding the relevant entities
for a given topic
•  Entity Distillation: finding relevant
resources for a given entity
•  Attribute Selection: finding a list of key
aspects to compare and differentiate a
given set of entities
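A minimal sketch of the Entity Ranking idea, assuming documents already come annotated with the entities they mention. It simply counts, for each entity, how many topic-matching documents mention it; the `rank_entities` function, the input shape and the placeholder entity names are assumptions for illustration, not the project's method.

```python
from collections import Counter


def rank_entities(docs, topic_terms):
    """Rank entities by the number of topic-matching documents mentioning them.

    docs: iterable of (text, entities) pairs, where entities is a list of
          entity names already extracted from the text (assumed input).
    topic_terms: words that define the topic.
    """
    topic = {t.lower() for t in topic_terms}
    counts = Counter()
    for text, entities in docs:
        words = set(text.lower().split())
        # A document matches the topic if it shares at least one topic word.
        if words & topic:
            # Count each entity once per document.
            counts.update(set(entities))
    return counts.most_common()


if __name__ == "__main__":
    sample = [
        ("the budget vote passed", ["Entity A", "Entity B"]),
        ("budget debate continues", ["Entity A"]),
        ("football results", ["Entity C"]),
    ]
    print(rank_entities(sample, ["budget"]))
```

Document frequency is only one possible relevance signal; it is used here because it keeps the example short.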
Information Discovery
Web Community Sensing
•  Modeling the credibility and authority of news
sources and opinion makers in social
networks
•  Identifying influential individuals and
experts on a given news topic
•  Monitoring the community reaction to news
stories and the polarity of opinions
Tracking Information Flow
•  Identifying the originating source of new
ideas and information
•  Understanding the evolutionary development of
ideas through their iterative retelling and
revision over time and across sources
–  detecting cases and patterns of re-use (e.g.
via “memes” or larger units of similar text) and
information flow for source identification and
novelty detection.
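One common way to detect re-use of larger units of similar text is to compare word n-gram "shingles" with Jaccard similarity. The sketch below assumes that approach; the function names, the shingle size and the threshold are illustrative choices, not values from the project.

```python
import re


def shingles(text, n=3):
    """Return the set of word n-grams (shingles) in the text."""
    tokens = re.findall(r"\w+", text.lower())
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def likely_reuse(text_a, text_b, n=3, threshold=0.5):
    """Flag a pair of texts whose shingle overlap reaches the threshold."""
    return jaccard(shingles(text_a, n), shingles(text_b, n)) >= threshold


if __name__ == "__main__":
    original = "the government announced new budget cuts today"
    retold = "sources say the government announced new budget cuts"
    print(likely_reuse(original, retold))
```

Source identification would additionally need publication timestamps to order the matching texts; that step is not shown here.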
Interaction and Personalization
•  Determining which interaction and
personalization mechanisms are best suited to:
–  Significantly enhance the user experience
–  Provide the news site with useful, tacit
feedback about its readers’ needs
•  Investigating interactive news interfaces that
support both automatic and manual
personalization for readers
•  Development of tools for querying
extracted information and visualizing
annotated documents and datasets
•  Continuous scanning of the social web,
news sources and various kinds of data
streams
–  SAPO already scans and processes many of
these streams, in particular the news media
Query and Visualization
•  Environment where the new tools and
resources developed in the project,
together with other software, will be
accessible
•  Will use tools and collect data for case
studies to be evaluated
–  observation and structured interviewing of the
journalists in contact with the developed tools.
•  The research will try to contextualize the
changing nature of media work
Computational Newsroom
•  Started October 1st 2010, 3 years
•  http://dmir.inesc-id.pt/reaction/
•  1st milestone: End of Month 6
–  REACTION Specification
–  First toolset prototype
(should have demoed it at the 2011 Collaboratory)
•  2nd milestone: End of Month 36
–  Demonstrable Computational Newsroom
More details