Literature Review with NaimAI, open sourced!

Yassine Kaddi
3 min read · Dec 8, 2022

Intro

I’ve developed NaimAI to help PhDs (and scientists in general) with their literature review. I’ve detailed the very first versions of the algorithm in previous articles (here and here). In this article, I want to share with you the main features of the current version. I’ll explain the search feature first, then the review feature, and finally the items I’d like to improve with the open source community (features that seem useful but that I couldn’t develop myself due to time constraints).

And by the way, NaimAI is open sourced. :)

Search feature

About 10 million open access abstracts are used in this version. NaimAI automatically structures any abstract into three sections: Objectives, Methods and Results.
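To give a rough idea of what this structuring step could look like, here is a minimal sketch that classifies each sentence of an abstract into one of the three sections with a Hugging Face text-classification pipeline. The model name is a placeholder, not the actual NaimAI model (the real models are on Hugging Face):

```python
# Minimal sketch (illustrative only, not the actual NaimAI code) of structuring
# an abstract by classifying each sentence into Objectives / Methods / Results.
# The model name below is a placeholder for a sentence classifier with those labels.
from transformers import pipeline

classifier = pipeline("text-classification", model="placeholder/abstract-section-classifier")

def structure_abstract(abstract: str) -> dict:
    """Group the sentences of an abstract by their predicted section."""
    sections = {"objectives": [], "methods": [], "results": []}
    for sentence in abstract.split(". "):
        label = classifier(sentence)[0]["label"].lower()
        sections.setdefault(label, []).append(sentence)
    return sections
```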

To search in NaimAI, you start by choosing one of the 10 fields, then search using keywords. Some search operators can be used (as explained on the website). Once you search, the relevant sentences are extracted from each abstract and shown under their category (objectives, methods or results of the paper). If you want to access the full structured abstract, you can click on the result card.

You can also search in your own PDF articles. In that case, the same pipeline applied to the 10 million papers is applied to your PDFs when you upload them under the ‘Custom’ tab. You can then search in your recently uploaded papers.
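To make the idea more concrete, here is a hedged sketch of what such a local pipeline might look like, using pypdf to extract text from your PDFs and reusing the hypothetical structure_abstract helper from the sketch above (again, an illustration, not the actual NaimAI upload code):

```python
# Illustration of the "Custom" idea: extract text from your own PDFs, then
# structure it with the hypothetical structure_abstract helper sketched above.
from pathlib import Path
from pypdf import PdfReader

def extract_text(pdf_path: Path) -> str:
    """Concatenate the text of every page of a PDF."""
    reader = PdfReader(str(pdf_path))
    return "\n".join(page.extract_text() or "" for page in reader.pages)

my_papers = {}
for pdf_file in Path("my_pdfs").glob("*.pdf"):
    my_papers[pdf_file.name] = structure_abstract(extract_text(pdf_file))
```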

Review feature

After the search results, the user can either review all the results (by clicking on Generate a review) or select only the papers they want. For each selected paper, the objective sentence is reformulated into reported speech (X et al. 2022 showed that..). Besides that, the list of references is generated. The whole review text can then be exported to Word format.
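To illustrate the export step, here is a rough sketch of how reformulated sentences and references could be assembled and saved to a Word file with python-docx. The reported-speech line is a naive template; NaimAI’s actual reformulation model is more involved:

```python
# Rough sketch of building a review and exporting it to Word with python-docx.
# The reported-speech line is a naive template, not NaimAI's reformulation model.
from docx import Document

# Hypothetical paper records (the fields are assumptions for this example).
papers = [
    {"authors": "X et al.", "year": 2022, "objective": "a structured search speeds up literature review"},
]

doc = Document()
doc.add_heading("Literature review", level=1)
for p in papers:
    doc.add_paragraph(f"{p['authors']} {p['year']} showed that {p['objective']}.")

doc.add_heading("References", level=1)
for p in papers:
    doc.add_paragraph(f"{p['authors']} ({p['year']}).")

doc.save("review.docx")
```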

Note that you can also review your own papers (in the Custom tab) once you upload your PDFs.

Open source project

I’ve open sourced NaimAI and the models are available on Hugging Face. This way, we can all improve the algorithms for the scientific community, kind of a “literature research by scientists, for scientists” thing!

There are some Colab examples so you can process your own papers, search in them and even review them using NaimAI, as explained in https://github.com/yassinekdi/naimai.

Many items still need to be enhanced, so it’d be awesome if we could improve them together!

Items to improve with the open source community

Here are some useful items I would like to develop and that I think the community could help with:

Review Generation

The current method consists of only rephrasing the objective sentence of each paper. I have some ideas on how to go further and improve the review generation part. Let me know if you’re interested and we’ll do it together!

Besides the generated text, the reference generation could still be brushed up to support many reference styles, and also to export references to other formats (BibTeX..).
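As a starting point, exporting a reference to BibTeX could be as simple as a template like the one below (the metadata fields are assumptions about what a paper record might contain, not NaimAI’s actual data model):

```python
# Toy example of formatting a paper record as a BibTeX entry.
# The metadata fields are assumptions, not NaimAI's actual data model.
def to_bibtex(paper: dict) -> str:
    key = f"{paper['first_author'].lower()}{paper['year']}"
    return (
        f"@article{{{key},\n"
        f"  author  = {{{paper['authors']}}},\n"
        f"  title   = {{{paper['title']}}},\n"
        f"  journal = {{{paper['journal']}}},\n"
        f"  year    = {{{paper['year']}}}\n"
        f"}}"
    )

print(to_bibtex({
    "first_author": "Doe",
    "authors": "Doe, Jane and Smith, John",
    "title": "An example paper title",
    "journal": "Example Journal",
    "year": 2022,
}))
```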

Semantic search

The search is mainly based on a v0 semantic algorithm (mainly using a TF-IDF model). In a previous version, I fine-tuned a BERT model for each field and the results were pretty interesting. The problem is that, with 10 fields, I ended up with 10 fine-tuned models, so the search was pretty slow and the models were heavy (not the best user experience). If you have any ideas and/or want to contribute to this part, I’ll be happy to talk to you!
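For context, a TF-IDF based search along these lines can be sketched with scikit-learn (a generic illustration with placeholder abstracts, not the exact NaimAI implementation):

```python
# Generic TF-IDF search sketch with scikit-learn (not the exact NaimAI implementation).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Placeholder abstracts standing in for the indexed papers.
abstracts = [
    "We propose a new model for protein structure prediction.",
    "This study measures the impact of sleep on memory consolidation.",
]

vectorizer = TfidfVectorizer(stop_words="english")
matrix = vectorizer.fit_transform(abstracts)

def search(query: str, top_k: int = 5):
    """Rank the indexed abstracts by cosine similarity to the query."""
    scores = cosine_similarity(vectorizer.transform([query]), matrix)[0]
    ranked = scores.argsort()[::-1][:top_k]
    return [(abstracts[i], float(scores[i])) for i in ranked]

print(search("protein folding models"))
```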

Data papers

I’ve used about 10 million open access abstracts I found here and there on the internet. If you have any source that could be useful, or even better, if we can process many more papers together to provide more information to the users, that’d be cool!

Other

If you want to contribute in any other way, I’ll be happy to hear from you :)

Stay in touch:

Reddit, Twitter, Facebook.
