linux.conf.au 2021 | Presentation: Open Journal Matcher: open journal discovery for everyone

Presented by

Mark Eaton
https://projects.ocert.at

Mark Eaton is a Reader Services Librarian and Associate Professor at Kingsborough Community College, at the City University of New York. He holds advanced degrees from the University of Toronto and Queen’s University (both in Canada). Having recently been appointed web librarian for his library, Mark has a lot of plans for improving the library’s public face on the web. Mark is a committed advocate for code education for everyone, and has led programming workshops for librarians both at CUNY and at national conferences. He enjoys working to support his students, colleagues and community. He believes that when librarians build open, appropriate, and privacy-respecting technologies for their communities, it can be empowering for both librarians and their libraries.

Abstract

I recently built a web application called the Open Journal Matcher (https://ojm.ocert.at), which is a recommender tool for academics looking to find a suitable scholarly journal for their work. The Open Journal Matcher allows users to paste in a draft abstract, which it then compares with the abstracts of over 5600 journals from the Directory of Open Access Journals. The application then returns the top five suggested matches, which are meant to be both relevant and serendipitous. This can be very useful to anyone trying to find an appropriate journal for their work. While there are other journal matching services available, to my knowledge this is the only one that is both fully interdisciplinary and fully open source. The code for the matcher application, the code for the matching algorithm, and the content of the journals are all openly licensed.

Upon its release in June 2020, the OJM received a very favorable reception from the open scholarship and scholarly communications communities. It has been shared widely on many platforms. For me, this has reaffirmed the need for such a tool, and has led me to focus on its further development.

This presentation will describe insights gleaned while building this tool. I’ll describe the challenges of gathering journal data from the Directory of Open Access Journals’ API at scale; the numerous lessons learned while using natural language processing tools to calculate the similarity of texts; and the difficulties of processing large amounts of data very quickly for the web using Google Cloud Platform and asynchronous Python programming.

This project also raises important questions about the ethics of algorithmic decision making. As technologists, how do we critically evaluate the algorithms that we use when we solve practical programming problems? How do we communicate the trade-offs and choices we make to stakeholders in our communities who rely upon the tools we build? And how do we ensure that feedback from our communities always comes first in shaping those tools, so that they most help the people we are serving?

This presentation will be of interest to authors who are looking for a journal to publish their scholarly work. It will also be relevant to technologists who are interested in building open source tools for their communities. It will be helpful to librarians who promote scholarly communications, and who may find this tool to be a useful addition to their toolkit. Lastly, it will serve as an interesting example of the novel services we can provide to our communities when we apply open digital technologies in support of our scholarship.
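
For readers who want a concrete picture of the matching step described in the abstract (compare a draft abstract against a set of journal abstracts and return the top five), here is a minimal sketch in Python. It is not the Open Journal Matcher's actual code. The abstract does not say which similarity measure the tool uses, so this sketch assumes a plain bag-of-words cosine similarity as a stand-in, and every function name and journal title in it is hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_matches(draft, journal_abstracts, k=5):
    """Return the k (title, score) pairs most similar to the draft.

    journal_abstracts maps a journal title to its abstract text.
    Assumption: word-count cosine similarity stands in for whatever
    measure the real tool uses.
    """
    draft_vec = Counter(tokenize(draft))
    scored = [
        (title, cosine_similarity(draft_vec, Counter(tokenize(text))))
        for title, text in journal_abstracts.items()
    ]
    # Highest similarity first; ties keep their original order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Hypothetical data, for illustration only.
    journals = {
        "Hypothetical Journal of Marine Biology": "coral reef fish ecology and ocean habitats",
        "Hypothetical Journal of Library Technology": "open source software for libraries and librarians",
    }
    for title, score in top_matches("open source software built by librarians", journals):
        print(f"{score:.3f}  {title}")
```

A word-count comparison like this only rewards exact word overlap, so it would miss the "serendipitous" matches the abstract mentions; a real system would likely need a richer text representation.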