Combining Bibliometrics and Information Retrieval
Bibliometric techniques are not yet widely used to enhance retrieval processes in digital libraries, yet they offer value-added effects for users. How can we build scholarly information systems that explicitly use them at the user-system interface? We will explore how network analysis, statistical modelling, and mapping of scholarship can improve retrieval services for specific communities, as well as for large, cross-domain collections. Some of these techniques are already used in working systems; others are envisioned for the future. We will ask: how can models of science be interrelated with scholarly, task-oriented searching? And can insights from searching improve the science models themselves?
This workshop aims to raise awareness of the missing link between bibliometrics and information retrieval, and to create common ground for incorporating science models into retrieval at the digital library interface. It will involve keynote talks, research project reports, demonstrations, and a panel discussion on next-generation services. Our interests include information retrieval, information seeking, science modelling, network analysis, and digital libraries. The goal is to apply insights from bibliometrics, scientometrics, and informetrics to concrete, practical problems of information retrieval and browsing.
GESIS – Leibniz Institute for the Social Sciences, Germany
GESIS - Leibniz Institute for the Social Sciences, Germany
GESIS - Leibniz Institute for the Social Sciences, Germany
DANS, Royal Netherlands Academy of Arts and Sciences in Amsterdam, Netherlands
Howard D. White
College of Information Science and Technology, Drexel University, USA
Topic Extraction Methods
Bibliometrics has been enriched by text analysis techniques over the past decade or so. Unstructured text offers a rich Science, Technology & Innovation (ST&I) information resource. The two main approaches are 1) Artificial Intelligence (AI), which is strong when you know what you want to find; and 2) statistical tools, which are powerful when you are exploring broadly. This session demonstrates the use of a set of statistical tools to concentrate topical content for further analyses.
We illustrate this process, term clumping, via a case analysis (Nano-Enabled Drug Delivery). The process starts by downloading a search set of abstract records on a chosen ST&I area from R&D publication and/or patent databases (e.g., Web of Science). Using desktop software (e.g., VantagePoint, Thomson Data Analyzer), we apply Natural Language Processing (NLP) routines to extract noun phrases from titles, abstracts, claims, or other text fields. We may combine these with available keyword fields to give ~100,000 or more terms. We are developing a semi-automated process that steps through: application of thesauri (removing unwanted terms & consolidating); fuzzy matching routines (to consolidate variations); “parent-child” and Association Rule macros (to combine closely related terms); an acronym eliminator macro; and Term Frequency-Inverse Document Frequency (TFIDF).
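Three of the steps above (thesaurus application, fuzzy matching, and TFIDF) can be sketched in a few lines of Python. This is only an illustrative sketch using the standard library, not the VantagePoint or Thomson Data Analyzer macros themselves: the thesaurus entries, the 0.9 similarity threshold, the function names, and the three toy records are all assumptions made up for the example. The parent-child, Association Rule, and acronym steps are omitted.

```python
import math
from collections import Counter
from difflib import SequenceMatcher

# Hypothetical thesaurus: None marks an unwanted term, a string is the preferred form.
THESAURUS = {"study": None, "paper": None, "nanoparticles": "nanoparticle"}


def apply_thesaurus(terms):
    """Lower-case terms, drop unwanted ones, and map known variants to a preferred form."""
    cleaned = []
    for term in terms:
        term = term.lower().strip()
        if term in THESAURUS:
            if THESAURUS[term] is None:
                continue
            term = THESAURUS[term]
        cleaned.append(term)
    return cleaned


def fuzzy_consolidate(docs, threshold=0.9):
    """Merge near-identical spellings into the most frequent variant seen so far."""
    counts = Counter(term for doc in docs for term in doc)
    representatives, canonical = [], {}
    for term, _ in counts.most_common():
        match = next(
            (rep for rep in representatives
             if SequenceMatcher(None, term, rep).ratio() >= threshold),
            None,
        )
        if match is None:
            representatives.append(term)
            match = term
        canonical[term] = match
    return [[canonical[term] for term in doc] for doc in docs]


def tfidf(docs):
    """Score each term per record as (relative term frequency) * log(N / document frequency)."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        term_freq = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in term_freq.items()
        })
    return scores


# Toy records standing in for noun phrases extracted from abstracts (made-up data).
raw_docs = [
    ["Nanoparticles", "drug delivery", "liposome", "study"],
    ["nanoparticle", "drug-delivery", "polymer micelle"],
    ["liposomes", "gene therapy", "paper"],
]

clumped = fuzzy_consolidate([apply_thesaurus(doc) for doc in raw_docs])
scores = tfidf(clumped)

for doc, doc_scores in zip(clumped, scores):
    top_term = max(doc_scores, key=doc_scores.get)
    print(doc, "-> highest TFIDF term:", top_term)
```

In this sketch, spelling variants such as "drug-delivery" and "liposomes" are folded into their more common forms before scoring, so TFIDF is computed on the consolidated vocabulary rather than on the raw phrases. A real run over ~100,000 terms would need a faster matching strategy than the pairwise comparison shown here.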
We then explore two approaches to extract intelligence from the clumped topic terms & phrases: 1) inductive statistical methods (Principal Components; topic modeling); and 2) aids to deduce purposes (e.g., Subject-Action-Object and Semantic TRIZ analyses) to get at linkage between R&D and potential applications. We look forward to exchanging ideas on applying and improving these “tech mining” tools.
Bibliometric analysis for funding agencies
Research funding organizations play an important role in the development of science, and they are also interested in studying how they influence the scientific landscape. But which data sources and which bibliometric indicators are appropriate for this purpose? Another challenge is how to collect reliable data on publications that can reasonably be linked to projects funded by these organizations. The inclusion of funding information in the bibliographic records of the Web of Science database since 2008 brings new opportunities and challenges for the study of research funding organizations.
Discussing the opportunities, limitations, and challenges of this new type of data is an important step towards new scientometric studies targeting funding organizations.
One representative of the European Research Council (N.N.)
Standards for Science Mapping and Classifications
The proposed workshop will build on and extend the JSMF Workshop on Standards for Science Metrics, Classifications, and Mapping that took place August 11-12, 2011 at Indiana University in Bloomington, IN (http://scimaps.org/meeting/110810/).
It will bring together researchers and practitioners interested in the scientific development and proper usage of science classifications and science maps. Demonstrations of existing approaches, tools, and techniques will provide a point of departure for a discussion of challenges and opportunities for developing scientifically sound standards for measuring and communicating the structure and dynamics of science and technology (S&T). Among other topics, we will discuss:
- Strengths, weaknesses, and limitations of existing classifications in a bibliometric/science mapping context;
- Properties a good classification should have in a science mapping context.
- Properties a good classification should have in a metrics context. Are classification properties for science mapping and metrics compatible?
- Development of new classifications at the paper level that take into account the nature, methods, and objects of disciplines (and history, etc.), but also their citation characteristics;
- Dynamically evolving classifications that can capture data from 1900 to today.
- How to classify interdisciplinary papers/journals.
S&T Mapping Standards
- Strengths, weaknesses, and limitations of existing science maps;
- Properties a good science map should have, and existing/emergent standards that help harmonize existing academic/government/industry standards.
- Alignment of S&T classifications/maps with other ontologies and taxonomies to support cross-walks and mapping.
Cyberinfrastructure for Network Science Center, Director, School of Library and Information Science, Indiana University, Bloomington, IN, USA / Royal Netherlands Academy of Arts and Sciences (KNAW), The Netherlands
Ohio State University, USA
Université de Montréal, Canada