The first task is to retrieve each complete paragraph given only its first and last few words. The second task is to classify the topics of the retrieved paragraphs using machine learning models (a multilabel classification problem, since a paragraph can have more than one topic).
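The first task could be sketched roughly as follows. This is a minimal illustration, not the code in the notebooks: the function name fetch_paragraph is hypothetical, and it assumes that paragraphs in the source text are separated by blank lines and that the known first and last words match the text exactly.

```python
from typing import Optional


def fetch_paragraph(document: str, first_words: str, last_words: str) -> Optional[str]:
    """Return the first paragraph that starts with first_words and ends with last_words.

    Assumption: paragraphs are separated by blank lines and the snippets
    match the text exactly (same case, spacing, and punctuation).
    """
    for paragraph in document.split("\n\n"):
        text = paragraph.strip()
        if text.startswith(first_words) and text.endswith(last_words):
            return text
    return None  # no paragraph matched both snippets


# Hypothetical sample text, for illustration only.
sample = (
    "Alpha beta gamma delta.\n\n"
    "The quick brown fox jumps over the lazy dog.\n\n"
    "Omega ends here."
)
print(fetch_paragraph(sample, "The quick", "lazy dog."))
```

Exact matching is the simplest choice. If the snippets may differ from the source in case or whitespace, both sides would need to be normalized before comparing.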
The models used initially are Random Forest and BERT.
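As an illustration of the multilabel setup with a Random Forest, here is a minimal sketch using scikit-learn. The toy paragraphs, the topic names, the TF-IDF features, and the hyperparameters are all assumptions made for this example; the actual features and settings are those in the notebooks.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import MultiLabelBinarizer

# Hypothetical toy data: each paragraph may carry several topics.
paragraphs = [
    "The team won the championship game last night.",
    "The central bank raised interest rates again.",
    "The club signed a sponsorship deal with a major bank.",
    "A new smartphone chip was announced today.",
]
topics = [["sports"], ["finance"], ["sports", "finance"], ["technology"]]

# Turn the topic lists into a binary indicator matrix (one column per topic).
mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(topics)

# Represent each paragraph as a TF-IDF vector (an assumed feature choice).
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(paragraphs)

# RandomForestClassifier accepts a 2D indicator matrix as a multilabel target.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X, Y)

# Predict topics for an unseen paragraph and map the columns back to names.
new_X = vectorizer.transform(["The bank sponsored the championship game."])
predictions = clf.predict(new_X)
predicted_topics = mlb.inverse_transform(predictions)
print(predicted_topics)
```

Each column of the indicator matrix is a separate yes/no decision, which is what distinguishes multilabel classification from ordinary multiclass classification, where exactly one label is chosen per paragraph.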
The main notebook is main.ipynb, and the finalized dataframe is to_fill_finalized_BERT.csv.
The topic_classification_BERT.ipynb notebook contains the full training code and predictions of the BERT model.