Can data science be used for testing scoring?
  • 24 Mar 2022

Introduction
Our community told us that testing the scoring methodology is a challenge in its own right. That made us ask whether we could use data science and code to build something useful that Research Analysts may already be thinking about. In today's content, we take a leap of faith and share some code that could be a building block for creating test data, which in turn gives you the option to run machine learning models on that data. If you are interested in talking more about it, say Yes in the comments of the original post here.

Access to Seeds Index

The index measures and compares the efforts of the world’s leading seed companies to enhance the productivity of smallholder farmers. Matching the expectations of stakeholders in and around the seed industry with company performance helps to clarify the role of the industry. It also brings transparency to the contributions of individual companies. Index findings contribute to an informed dialogue on how companies can step up their efforts.  You can learn more about it in our previous content here.

Scoring approach

Scoring takes place at the indicator level. The index uses two scales: a three-point scale with one-point increments (0, 1 and 2) and a five-point scale with half-point increments (0, 0.5, 1, 1.5 and 2). In each case, a score of 0 typically reflects no relevant disclosure and a score of 2 reflects leading performance. Indicators are grouped into measurement areas, and weights are applied when the scores are combined.

Problem Statement

Research Analysts want enough example company entries to test the final scores, so that they can verify which companies are leading and which are lagging. This matters to them because they need to confirm that the weightage is applied properly and that the final score adds up. In some cases, they would also like to project the year-on-year growth of companies and run some probabilities. It would be great to see scorecards and dashboards based on different scores, especially graphs by sector, pillar or country. There are also plenty of machine learning algorithms that could run classification on the companies' test data, which may bring new insights to the surface.
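
The weightage check described above can be sketched in a few lines. This is only an illustration under stated assumptions: the weights array, the final_score helper and the sample scores are made up for this example and do not come from the index.

```python
import numpy as np

# Hypothetical indicator weights for A1..B6 (NOT the real index weights).
# They are chosen so that they sum to 1.
weights = np.array([0.2, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1])

def final_score(indicator_scores, weights):
    """Return the weighted total of the indicator scores (0 to 2 scale)."""
    indicator_scores = np.asarray(indicator_scores, dtype=float)
    if indicator_scores.shape != weights.shape:
        raise ValueError("one score per weight is required")
    if not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must sum to 1")
    return float(indicator_scores @ weights)

# One company's scores for A1..B6 (made-up test values).
scores = [2.0, 1.5, 0.0, 1.0, 0.5, 2.0, 1.0, 2.0, 0.0]
total = final_score(scores, weights)
print(total)
```

Because the weights sum to 1, a company that scores 2 on every indicator should end up with a final score of 2, which is a quick sanity check that the total adds up.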

First Step

We are taking one tiny step: creating test points for each indicator in areas A and B. Indicators C, D, E and F are out of scope. Each company will have 9 test outputs, one each for indicators A1, A2, A3, B1, B2, B3, B4, B5 and B6. One important distinction is that some indicators use the three-point scale and others the five-point scale. Example: A1 uses the five-point scale and B4 uses the three-point scale. The code will create test output for 20 companies.

We will explain some of the control points of the code further down. The purpose of this content is to show you some output and get your feedback on whether there is any test data you would like to create by building on this code. To run the code now, follow these 3 steps.

  1. Copy the code from the box under the heading Code
  2. Paste the code in this online tool https://www.programiz.com/python-programming/online-compiler/ 
  3. Click on the blue Run button at the top centre of the editor. The test data for 20 companies will be created on the right-hand pane. See the example image under the heading Output

Code

import numpy as np

#Each value is a different scale: three-point scale or five-point scale
# Indicator mapping      A1,A2,A3,B1,B2,B3,B4,B5,B6
SeedScoring  = np.array([5, 5, 5, 5, 5, 5, 3, 3, 5])
CompanyCount = 20

FivePointScale  = np.array([0.0,0.5,1.0,1.5,2.0]) #five-point scale
ThreePointScale = np.array([0.0,1.0,2.0]) #three-point scale

PointScales = {
  5: FivePointScale,
  3: ThreePointScale}

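def TestData(Pointer):
    # Pick one value at random without reordering the shared scale array
    return np.random.choice(Pointer)

def SeedTestData(pt):
    # Look up the scale (3 or 5) and draw one score from it
    return TestData(PointScales[pt])

def CompanyTestScore():
    # Draw one score per indicator listed in SeedScoring
    vfunc = np.vectorize(SeedTestData)
    return repr(vfunc(SeedScoring))

# Print one row of test scores per company
for _ in range(CompanyCount):
    print(CompanyTestScore())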

Output

Do you want to join our Scoring Method Call to discuss this further?
We want to find out whether this is useful and whether it has the potential to improve. We think it will be interesting for the community. The next step could be a 1-hour call with at least 3 members of this community, where we can talk about the use case and share more explanation of this tiny piece of code. You can let us know by mentioning Yes in the comments of the original post here.



Code Dive-in

We want to explain the control points from the utility point of view:

  1. Scoring Method (line no 5)
  2. Company Count (line no 6)
  3. Point Scale (line no 8 & 9)
  4. PointScales (line no 11 to 13)

Scoring Method (line no 5)

#Each value is a different scale: three-point scale or five-point scale
# Indicator mapping      A1,A2,A3,B1,B2,B3,B4,B5,B6
SeedScoring  = np.array([5, 5, 5, 5, 5, 5, 3, 3, 5])

The values you can change here are the ones inside the square brackets []. Each number says whether the corresponding indicator uses a five-point scale or a three-point scale. The first number is for A1 (five-point scale), the second is for A2 (five-point scale), and so on. Suppose you want to add the scoring method for the C indicators. Example: if C1, C2, C3, C4, C5 and C6 are all five-point scale, you append 5,5,5,5,5,5 to the original list. It will look like this: SeedScoring  = np.array([5, 5, 5, 5, 5, 5, 3, 3, 5, 5, 5, 5, 5, 5, 5])
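
As the list grows, it becomes easy to lose track of which number belongs to which indicator. The sketch below is one possible way to guard against that. The IndicatorLabels list and the ScaleByIndicator dictionary are illustrative additions and are not part of the original code.

```python
import numpy as np

# Illustrative labels (an assumption); the order must match SeedScoring.
IndicatorLabels = ["A1", "A2", "A3", "B1", "B2", "B3", "B4", "B5", "B6",
                   "C1", "C2", "C3", "C4", "C5", "C6"]
SeedScoring = np.array([5, 5, 5, 5, 5, 5, 3, 3, 5, 5, 5, 5, 5, 5, 5])

# Fail early if a scale was added without a label, or the reverse.
assert len(IndicatorLabels) == len(SeedScoring)

# Pair each indicator with its scale so the mapping can be checked by eye.
ScaleByIndicator = dict(zip(IndicatorLabels, SeedScoring.tolist()))
print(ScaleByIndicator)
```

Printing the dictionary shows each indicator next to its scale, so a misplaced 3 or 5 is easier to spot than in the bare list.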

Company Count (line no 6)

CompanyCount = 20

This is straightforward: replace the number 20 with the number of companies you need. Keep it at 500 or below. Remember, this is play code.
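
If you want the same test data on every run, or want all companies in one table for later analysis, a variation like the one below may help. It is a sketch rather than a replacement for the code above. The seed value 42 and the ScoreTable name are assumptions chosen for this example.

```python
import numpy as np

SeedScoring = np.array([5, 5, 5, 5, 5, 5, 3, 3, 5])
CompanyCount = 20
PointScales = {
    5: np.array([0.0, 0.5, 1.0, 1.5, 2.0]),
    3: np.array([0.0, 1.0, 2.0]),
}

# A fixed seed (assumed value) makes every run produce the same test data.
rng = np.random.default_rng(42)

# Draw one column per indicator, with one value per company.
columns = [rng.choice(PointScales[int(pt)], size=CompanyCount)
           for pt in SeedScoring]

# Rows are companies, columns are indicators A1..B6.
ScoreTable = np.column_stack(columns)

print(ScoreTable.shape)   # (CompanyCount, number of indicators)
print(ScoreTable[:3])     # first three companies
```

Changing CompanyCount changes only the number of rows, so the same table layout works for 20 or 500 companies.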

Point Scale (line no 8 & 9)

FivePointScale  = np.array([0.0,0.5,1.0,1.5,2.0]) #five-point scale
ThreePointScale = np.array([0.0,1.0,2.0]) #three-point scale

These 2 lines represent two different point scales. If tomorrow you have a two-point scale, you can add a new line for it:

TwoPointScale = np.array([0.0,2.0]) #two-point scale

PointScales (line no 11 to 13)

PointScales = {
  5: FivePointScale,
  3: ThreePointScale}

This block of code changes only when you introduce a new point scale, such as the two-point scale added above. Insert the new scale inside this dictionary. It can get a bit tricky: the spelling of TwoPointScale must match exactly, and you need to add a comma after ThreePointScale on the line for 3. It will look like this:

PointScales = {
  5: FivePointScale,
  3: ThreePointScale,
  2: TwoPointScale}
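
To see the two-point scale in action, the pieces can be put together as in the sketch below. The SeedScoring mapping here is hypothetical: it switches the last indicator to the two-point scale purely for illustration. The check for missing scales is also an addition, not part of the original code.

```python
import numpy as np

FivePointScale  = np.array([0.0, 0.5, 1.0, 1.5, 2.0])  # five-point scale
ThreePointScale = np.array([0.0, 1.0, 2.0])            # three-point scale
TwoPointScale   = np.array([0.0, 2.0])                 # two-point scale

PointScales = {
  5: FivePointScale,
  3: ThreePointScale,
  2: TwoPointScale}

# Hypothetical mapping: the last indicator uses the two-point scale.
SeedScoring = np.array([5, 5, 5, 5, 5, 5, 3, 3, 2])

# Every scale referenced in SeedScoring must exist in PointScales.
missing = set(SeedScoring.tolist()) - set(PointScales)
assert not missing, f"No point scale defined for: {missing}"

# Draw one score per indicator for a single company.
scores = np.array([np.random.choice(PointScales[pt])
                   for pt in SeedScoring.tolist()])
print(scores)
```

The last value printed is always 0.0 or 2.0, because the two-point scale has no other options.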

References:

  1. 2021 Access to Seeds Index – Scoring guidelines - https://assets.worldbenchmarkingalliance.org/app/uploads/2021/09/2021-Access-to-Seeds-Index-Scoring-guidelines.pdf
  2. Programiz | Python Online Compiler - https://www.programiz.com/python-programming/online-compiler/
