Pontifícia Universidade Católica de Minas Gerais

ARPPA: Mining Professional Profiles from LinkedIn Using Association Rules

Authors: Paula Raissa Costa e Silva, Wladmir Cardoso Brandão
February 24, 2015

MOTIVATION AND PROBLEM
• Social network proliferation
• Professional information volume
• Professional profile extraction and analysis
Knowledge discovery on professional profiles extracted from the Web

OBJECTIVES
• Introduce the ARPPA approach:
  • ARPPA: Association Rules for Professional Profile Analysis
  • Retrieve relevant information on professional profiles
  • Recognize mutual implications among professional events
  • Characterize professional profiles
  • Impact:
    • Companies: plan employees' careers
    • Universities: plan, guide, and implement academic activities and course curricula

RELATED WORK
• Russell, 2013: Crawling and data mining of social networks.
• Pizzato & Bhasin, 2013: Data mining, web search engine, and social network analysis on LinkedIn for a people recommender system.
• Xu, Li, Gupta, Bugdayci & Bhasin, 2014: SimCareers, a framework for modeling similarities among professional profiles.

ARPPA APPROACH

EXPERIMENTATION
• Graduates of PUC Minas' IT courses
• Source: LinkedIn
• Period: 2004 to 2013
• Professionals: 1847
• Resumes: 398

RESULTS

Association Rules
• Systems Analyst career in the Information Technology and Services area
  Minimum confidence: 0.87
• Students who graduated in 2010 and hold a Senior-level position in Information Technology and Services
  Minimum confidence: 0.95
• Professionals specialized in SAP in the Information Technology and Services area
  Minimum confidence: 1.0
• Professionals from the Information Technology and Services area working in Belo Horizonte
  Minimum confidence: 0.97

CONCLUSION AND FUTURE WORK
• ARPPA: effective for characterizing professional profiles
• Multidimensional data model suitable for professional profile analysis
• Simple approach based on association rules
• Future work:
  • Crawler optimization
  • Different data mining algorithms
  • Enrich the multidimensional data model
  • Extend to other courses

Thanks!
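As an illustration of the confidence measure quoted for the association rules above, the following minimal Python sketch computes the support and confidence of a rule A → C using the standard definitions: support is the fraction of profiles containing an itemset, and confidence is support(A ∪ C) / support(A). The slides do not say which mining algorithm or tooling ARPPA uses, so this is not the authors' implementation. The profiles, item labels, and the `support` and `confidence` helpers are hypothetical and chosen only for illustration; the numbers are not results from the study.

```python
from typing import FrozenSet, List

# Hypothetical toy profiles (NOT data from the study). Each profile is a
# set of "attribute=value" items, one set per professional.
profiles: List[FrozenSet[str]] = [
    frozenset({"area=IT and Services", "city=Belo Horizonte", "level=Senior"}),
    frozenset({"area=IT and Services", "city=Belo Horizonte", "level=Junior"}),
    frozenset({"area=IT and Services", "city=Sao Paulo", "level=Senior"}),
    frozenset({"area=Banking", "city=Belo Horizonte", "level=Senior"}),
]


def support(itemset: FrozenSet[str], transactions: List[FrozenSet[str]]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(
    antecedent: FrozenSet[str],
    consequent: FrozenSet[str],
    transactions: List[FrozenSet[str]],
) -> float:
    """support(A ∪ C) / support(A); returns 0.0 if A never occurs."""
    antecedent_support = support(antecedent, transactions)
    if antecedent_support == 0:
        return 0.0
    return support(antecedent | consequent, transactions) / antecedent_support


# Illustrative rule: {area = IT and Services} -> {city = Belo Horizonte}
antecedent = frozenset({"area=IT and Services"})
consequent = frozenset({"city=Belo Horizonte"})

rule_support = support(antecedent | consequent, profiles)
rule_confidence = confidence(antecedent, consequent, profiles)

print(f"support = {rule_support:.2f}, confidence = {rule_confidence:.2f}")
```

In this toy data, two of the three IT and Services profiles are in Belo Horizonte, so the rule's confidence is 2/3. A rule would be reported only if its confidence met a chosen minimum threshold, which is how the "Minimum confidence" values on the slide can be read.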