The first task was to gather testimony data from disparate sources across our organisation and merge it into a single dataset. These anonymised testimonies reach CV's repositories through reporting mechanisms linked to our evangelistic initiatives. Merging them required careful standardisation of fields such as the country of mission and the people group, so that records from different sources could be compared and processed together. This preprocessing was carried out with Python data manipulation libraries.
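To make the blending and standardisation step concrete, here is a minimal sketch using only the Python standard library. It is not the pipeline we ran: the source names, column names, and alias table below are hypothetical placeholders chosen for illustration.

```python
from typing import Dict, Iterable, List

# Hypothetical mapping from each source's column names to one shared schema.
FIELD_MAPS: Dict[str, Dict[str, str]] = {
    "source_a": {"Country": "country", "PeopleGroup": "people_group", "Story": "testimony"},
    "source_b": {"mission_country": "country", "ethnic_group": "people_group", "text": "testimony"},
}

# Hypothetical alias table: lower-cased variants mapped to one canonical spelling.
COUNTRY_ALIASES: Dict[str, str] = {
    "usa": "United States",
    "u.s.a.": "United States",
    "united states of america": "United States",
    "uk": "United Kingdom",
}


def standardise_country(raw: str) -> str:
    """Collapse whitespace and replace known aliases with a canonical name."""
    cleaned = " ".join(raw.split())
    # Unknown values are returned unchanged rather than guessed at.
    return COUNTRY_ALIASES.get(cleaned.lower(), cleaned)


def blend(sources: Dict[str, Iterable[Dict[str, str]]]) -> List[Dict[str, str]]:
    """Rename each source's fields to the shared schema and merge all rows."""
    merged: List[Dict[str, str]] = []
    for name, records in sources.items():
        field_map = FIELD_MAPS[name]
        for record in records:
            row = {target: record.get(src, "").strip() for src, target in field_map.items()}
            row["country"] = standardise_country(row["country"])
            row["source"] = name  # keep provenance for later auditing
            merged.append(row)
    return merged
```

Calling `blend` with one list of records per source returns a single list of rows that share the same keys, with a `source` field recording where each row came from. The same idea carries over to a dataframe library if one is preferred.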
A further part of this phase was extracting information about the role digital media played in each seeker's life-changing events. We used a locally hosted LLaMA 2 model, running on Mac M-series hardware, to identify and extract the names of the digital platforms mentioned in each testimony, such as Facebook, WhatsApp, and Phone. The extracted names were then standardised across the dataset so that every platform is recorded under a single consistent label.
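The standardisation of extracted platform names can be sketched as follows. The call to the model is deliberately left out. The sketch assumes the model's answer arrives as free text with names separated by commas, semicolons, or new lines, and the alias table is a hypothetical example rather than the mapping we actually used.

```python
import re
from typing import Dict, List

# Hypothetical alias table: lower-cased variants mapped to one canonical label.
PLATFORM_ALIASES: Dict[str, str] = {
    "facebook": "Facebook",
    "fb": "Facebook",
    "whatsapp": "WhatsApp",
    "whats app": "WhatsApp",
    "phone": "Phone",
    "phone call": "Phone",
    "telephone": "Phone",
}


def standardise_platforms(raw_answer: str) -> List[str]:
    """Turn a free-text answer such as 'fb, Whats App' into canonical names."""
    canonical: List[str] = []
    # Assumed answer format: names separated by commas, semicolons, or newlines.
    for piece in re.split(r"[,;\n]+", raw_answer):
        key = " ".join(piece.lower().split())
        name = PLATFORM_ALIASES.get(key)
        # Skip unrecognised text and avoid listing the same platform twice.
        if name and name not in canonical:
            canonical.append(name)
    return canonical
```

Anything the alias table does not recognise is dropped silently here. In practice it may be safer to log unrecognised values so that the table can be extended by hand.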