A Brief Glimpse into the Future of Data Science in Mineral Exploration
The Oz Minerals-Unearthed Solutions Explorer Challenge wrapped up Thursday night (PDT) with a deep sense of enthusiasm for the future of crowdsourcing and data science in target generation for mineral exploration. Congratulations to all the winners! The judges, experts and the Unearthed team were very impressed with the quality of the submissions.
While my entry didn't win a prize, I am very proud of the work. I am also very pleased with my results, as I was able to build on the aspects that had been removed from my Ph.D. thesis project, due to my committee's perceived lack of time, and apply them to a new area.
With an eye towards the future, a number of thoughtful insights were shared during the live broadcast about data science in mineral exploration, modelling exploration targets and how data sets are used.
1. Don't be left behind! Five years from now investors may require a measured level of success in finding ore for the project you want them to invest in.
In addition, investors may not be the only group looking for a quantifiable likelihood of finding ore: government agencies, NGOs (non-governmental organizations) and local stakeholders may also start asking for statistical validation of finding ore as tangible proof for exploring within a specified area.
2. Different models created a variety of answers while sometimes using the same data.
This is one of the hard sells for using data science in mineral exploration: there doesn't seem to be a consistent answer ... when you think about it, though, this is somewhat similar to different geologists providing different models for the same project area. The cause of both quandaries is, in fact, very similar.
A geologist's background, experience and sometimes education (e.g., who their thesis advisor/professors were) will invariably affect their answer, whether they are discussing a rock type or a deposit model. In the end, drill assays will prove them right or wrong.
The information used in data science, how it is prepared and what you are looking for will greatly affect the results of the model. There are a number of studies comparing different modelling methods on the exact same data to show the differences in efficiency or accuracy between models (e.g., the amount of data required to predict ore locations or how well the model is able to predict the location). These studies result in very similar answers; only the performance of the model changes. When different answers occur, it is typically the result of asking a different question (e.g., defining a specific deposit type vs. looking for any mineralization) or preparing the data differently ... which brings me to the next insight.
3. Data cleaning and fit-for-purpose.
Invariably the data you want to use isn't in the format you need, whether it is spread among multiple files or the units of measurement are different. In addition, you need to know the strengths and limitations of your data in order to set up an appropriate model. For example, does your data consist primarily of results only for your chosen commodity (such as gold fire assay) or does it include your commodity with a suite of elements (such as aqua regia multi-element data)? The focused data will allow you to home in on, or limit your model to, the locations and abundance of your element of interest (like a conventional resource model). The multi-element data may limit resource predictions but will allow for the discovery of mineralized systems, possibly uncovering drill-target near misses and alteration halos. A data scientist will spend the majority of their time cleaning and exploring the data, looking for its strengths and limitations, prior to creating a model.
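As a minimal sketch of the unit problem described above, the Python below converts assay records reported in mixed units to a single unit (ppm) before they are combined. The record layout, the unit labels and the function names are assumptions made for illustration; they are not the format of any particular lab or of the challenge data.

```python
# Conversion factors to parts per million (ppm).
# The unit labels are assumed; real lab files may use other spellings.
TO_PPM = {
    "ppm": 1.0,
    "g/t": 1.0,      # grams per tonne is numerically equal to ppm
    "ppb": 0.001,
    "pct": 10000.0,  # weight percent
}


def to_ppm(value, unit):
    """Convert one assay value to ppm, rejecting unknown units."""
    key = unit.strip().lower()
    if key not in TO_PPM:
        raise ValueError(f"unknown unit: {unit!r}")
    return value * TO_PPM[key]


def harmonize(records):
    """Return {(sample_id, element): value_in_ppm} from mixed-unit records."""
    table = {}
    for rec in records:
        key = (rec["sample_id"], rec["element"])
        table[key] = to_ppm(rec["value"], rec["unit"])
    return table


# Two hypothetical files reporting results in different units.
fire_assay = [{"sample_id": "S1", "element": "Au", "value": 250.0, "unit": "ppb"}]
multi_element = [{"sample_id": "S1", "element": "Cu", "value": 0.12, "unit": "pct"}]

combined = harmonize(fire_assay + multi_element)
for key, ppm in sorted(combined.items()):
    print(key, round(ppm, 4))
```

Raising an error on an unrecognized unit, rather than guessing, is a deliberate choice here: a silently misread unit can shift a value by orders of magnitude and is hard to spot once the files are merged.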
Geochemical data is probably one of the most difficult data sets to manage, keep clean and use appropriately. There is a wide variety of assay methods that dissolve some minerals better than others; the data is in the form of ratios that, without transformation, are not appropriate for use in multivariate statistics; and there is the added challenge of below-detection-limit values that are not zeros as well as above-detection-limit values that are unknown. For more information check out my post on Understanding Geochemical Data.
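To make the last two points concrete, here is a small Python sketch under stated assumptions. It assumes the lab reports censored values as strings such as "<0.5" and ">10000", replaces below-detection-limit values with a fraction of the limit (a simple and fairly crude substitution, chosen only for illustration), and then applies a centred log-ratio (clr) transform, one common transformation for ratio-type data. The function names and sample values are hypothetical, and this is not necessarily the method used in my own work.

```python
import math


def parse_assay(raw, fraction=0.5):
    """Turn a lab-reported string into a number.

    "<0.5" (below detection limit) becomes fraction * 0.5, a simple
    substitution used here for illustration only.
    ">10000" (above the upper limit) returns None: the true value is unknown.
    """
    text = str(raw).strip()
    if text.startswith("<"):
        return float(text[1:]) * fraction
    if text.startswith(">"):
        return None
    return float(text)


def clr(parts):
    """Centred log-ratio: log of each part minus the mean of the logs."""
    if any(p is None or p <= 0 for p in parts):
        raise ValueError("clr needs strictly positive, known values")
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


# Hypothetical multi-element sample in ppm, as reported by a lab.
raw_sample = {"Cu": "120", "Zn": "85", "As": "<0.5"}
values = [parse_assay(v) for v in raw_sample.values()]
transformed = dict(zip(raw_sample, clr(values)))
print(transformed)
```

Note that the sketch refuses to transform a sample containing an above-detection-limit value rather than inventing a number for it; how to handle those samples (re-assay, exclude, or treat separately) is a judgment call that depends on the project.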
All in all, this was a very fun challenge to be a part of, and I learned a lot along the way, including the need for a more advanced computer to tackle huge models. I am looking forward to the next challenge.
About the Author:
Diana has over 20 years of experience working in the mineral exploration industry searching for diamonds and metals in a range of roles: from heavy minerals lab technician to till sampler, rig geologist, project manager and business owner. Following a Master of Science degree in diamond indicator mineral geochemistry, Diana has conducted field work in BC, NWT, YT, ON (Canada) and in Greenland. She has also been involved, remotely through a BC-based office, in mineral exploration projects located in South America, Africa, Eurasia, and the Middle East. Diana finished a Ph.D. at UNBC in 2017 researching geochemical multivariate statistical analysis and interpretation. Currently, Diana is the owner of Takom Exploration Ltd., a small geological and environmental consulting firm focused on metal exploration in BC and the Yukon.