Thursday, 2 January 2014

Analytics in Education - Time for Change by Dr. S. Jayaprakash

The role of education in the contemporary world is not mere knowledge creation but creating a ‘bent of mind’ that lets students adapt to a competitive environment. Not all science students end up as scientists, nor do all engineering students end up practising their branch of engineering. A chemical engineer lands up as a Java programmer and a science graduate works for a courier firm, yet both still apply the basic concepts taught in their schools and colleges exceptionally well; this is what we call the ‘Bent of Mind’. In the previous century, the rank system was used to analyse students (rank as a form of analytics); then the grade system emerged (grade as a measure to analyse and categorise). With the growing complexity of measures and opportunities, richer analytics are now required to understand student aspirations, map their potential to the opportunities, and estimate their probability of success.

The world is becoming more dynamic, volatile and complex today. It is a universal truth that “Education is the key to the growth of any individual in this complex world”. Students join the educational stream aspiring to become successful relative to their friends, family, peers and others.

The purpose of education now is to create the required ‘Bent of Mind’ for students to adapt themselves and take up opportunities. This is what makes a chemistry graduate a successful accounting employee, or an architect a successful Java programmer.

How long can we run successfully on the ‘Bent of Mind’ concept alone? The existing educational system was created long ago, largely to fill clerical jobs. Thanks to computerization, many aspects of teaching have been computerized. Innovative teaching models like ‘Computer Based Training’ have evolved over the past two decades, taking teaching methods to new heights. Objective-based learning has taken centre stage, and even competitive exams have been oriented towards objective evaluation. The point is that the world has recognized the need for objective-based learning, preparing students to win in competition and marking a clear shift from earlier models of teaching.

In summary, teaching methodology and the drive among the majority of students have improved considerably.

A couple of decades back, information about opportunities could be accessed only through word of mouth, newspapers or the library. Now a great deal of public information is available on the web. With this technological evolution, why can’t we use technology to map opportunities to the individual potential of students, so they can excel in their aspirations?

Anyone can observe that various sports have started using analytics as a crucial element to beat the competition. Take cricket as an example: a batsman’s batting style is analysed from various angles, and the bowler focuses his attack on the basis of the findings of that analysis.

So, what does this analytics do?

  • Evaluates students across a set of more than 100 skill sets; it is also customizable, starting with simply marks and attendance and expanding to more skills
  • Helps students, parents and teachers identify the strengths and weaknesses of each student and channel their efforts accordingly
  • Gives an overview of student activities in a snapshot
  • Is available on mobiles, so the analytics can be carried out easily at any place and at any point of time
  • Offers easy upload options; most schools use only Excel sheets, and the analytics lets users upload those same Excel sheets for further analysis
  • Can be expanded to integrate advanced statistical predictive models as well
  • Improves the brand image of the school or college among the parent community

Wednesday, 11 December 2013

Registering and executing statistical models

nanobi analytics platform comes with a few built-in predictive models. These are time series forecasting, classification, association rules, regression and outlier detection.

These can be used like any other analytics, with specified declarative options. No R or SAS programming is needed.

However, for advanced statistical modelling requirements, there is a way to register bespoke models. After the nanomart is created, register the model and then execute it. The results are seen in the usual way (analytics and hives).

These operations can be done through UI or programmatically using the REST Web services. This post shows the pertinent web services.

It is assumed that authentication has already been done and a token is available. The auth token can then be used to call the usual metadata APIs to get AppIDs and MartIDs.

Armed with this, we can start working with model scripts.

Getting a list of mart scripts and models

<server url>/DataObjectServer/data/do?ot=ast&an=nbmdc_nanomarts_row_id&av=<mart id>&o==&tokenid=<auth token>

Example response:
                                "row_id": "fdffcd9d-f50a-4b14-9b4d-712abe49c34c",
                                "is_applied": "n",
                                "describe_name": "",
                                "icon_content": null,
                                "si_id": "106319b8-d03f-402c-a58e-e993d872a233",
                                "icon_path": null,
                                "updated_by_user_id": null,
                                "file_name": "ms_8b244f93-1bdd-43ea-9fef-d0f7cd50eae0_1_vbwyKFHdDYBEkfM.R",
                                "created_by_user_id": "a15cd765-ead8-4794-a32d-ab17bb826ae7",
                                "subscript_type": "R Script",
                                "nbmdc_assets": "my r code",
                                "script_type": "batchcommand",
                                "description": "my r code",
                                "sequence": "1",
                                "nbmdc_nanomarts_row_id": "8b244f93-1bdd-43ea-9fef-d0f7cd50eae0",
                                "created_datetime": "2013-12-10 04:39:58",
                                "active_flag": "y"
                                "row_id": "3cf08efb-c714-473d-bf69-a294fac6caad",
                                "is_applied": "y",
                                "describe_name": "",
                                "icon_content": null,
                                "si_id": "106319b8-d03f-402c-a58e-e993d872a233",
                                "icon_path": null,
                                "updated_by_user_id": null,
                                "file_name": "ms_8b244f93-1bdd-43ea-9fef-d0f7cd50eae0_0_kQTgGZHUMhEFJUg.R",
                                "created_by_user_id": "a15cd765-ead8-4794-a32d-ab17bb826ae7",
                                "subscript_type": "R Script",
                                "nbmdc_assets": "testing r",
                                "script_type": "batchcommand",
                                "description": "testing r",
                                "sequence": "0",
                                "nbmdc_nanomarts_row_id": "8b244f93-1bdd-43ea-9fef-d0f7cd50eae0",
                                "created_datetime": "2013-12-09 07:31:14",
                                "active_flag": "y"

Note: Not that you care, but if you are wondering, "ast" is short for "asset". A model script is internally stored as a mart asset.
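As a rough sketch, the listing call can be issued from Python using only the standard library. The server URL below is a placeholder; the mart id and token are the example values used in this post.

```python
from urllib.parse import urlencode

# Placeholder values -- substitute your own server URL, mart id and token.
SERVER_URL = "https://example.com"
MART_ID = "8b244f93-1bdd-43ea-9fef-d0f7cd50eae0"
AUTH_TOKEN = "0e9a4d40-e695-453b-854c-0330313236b3"

# ot=ast selects assets; the an/av pair filters on the nanomart row id,
# and o asks for an equality match.
query = urlencode({
    "ot": "ast",
    "an": "nbmdc_nanomarts_row_id",
    "av": MART_ID,
    "o": "=",
    "tokenid": AUTH_TOKEN,
})
list_url = f"{SERVER_URL}/DataObjectServer/data/do?{query}"
# Against a live server: json.load(urllib.request.urlopen(list_url))
```

Fetching `list_url` with `urllib.request.urlopen` would return the asset list shown above.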

Registering a mart script

POST <server url>/NanoClientApplication/martMaintanence
scriptDescription   my r code
script_type batchcommand
subscript_type R Script
scriptSequence 1
via url
fileUrl url/t.R
is_applied n
created_by_user_id a15cd765-ead8-4794-a32d-ab17bb826ae7
martid 8b244f93-1bdd-43ea-9fef-d0f7cd50eae0
si_id 106319b8-d03f-402c-a58e-e993d872a233
token 0e9a4d40-e695-453b-854c-0330313236b3

                "statusCode": "0",
                "statusMessage": "Maintenance script executed."

Note: While executing the script, the following built-in parameters are supported. They need not be explicitly passed to the script.

These parameters can be used in the script and are internally replaced by their values.
Additional command line parameters can be passed when executing the script.

Executing a script

<server url>/DataObjectServer/data/do/serverCommand/execute
tokenid   0e9a4d40-e695-453b-854c-0330313236b3
parameter {"id":"fdffcd9d-f50a-4b14-9b4d-712abe49c34c"}

                "statusCode": "0",
"Successfully executed command.status file url:server/statusFiles/nNFGqJAqSTrtDtF.out"

Wednesday, 30 October 2013

Insurance e-repository - Leveraging Analytics
Following on the heels of the UIDAI initiative comes Insurance Repository, another innovative service.
Currently, a customer has to submit the entire set of KYC documents every time he or she takes up a new policy. Insurance Repository helps dematerialise the policy details, obviating the need to submit a new set of documents.
Under the new facility, a customer enrols with an Insurance Repository services vendor for an e-insurance account, which accords the customer a unique ID. All the KYC-related information is fed into this account. After this, any new policy details will be electronically transmitted by the insurer to this account.
Payment and agent details can be viewed through this platform. The customer can also file complaints through this site.
However, the initiative has both pros and cons.


Although it may take time to dematerialise the old policies of the customers, which are in paper format, it’s a good initiative as it helps to reduce the cost of administrative overheads.
This should also help reduce premium rates to some extent in the long run. As a good early bird offer, the insurers should offer a small incentive for e-insurance account opening.
It can track policies across insurers, so that customers can compare the services in a better manner.
Policy terms and conditions have always been printed in small-sized font. Insurers should now have enough space to print in bigger sizes. The Insurance Regulatory and Development Authority (IRDA) can also think about how to overcome the challenges that were faced earlier due to lack of space and the costs involved.
Undelivered policies are a big problem for insurance companies. E-insurance is one way of dealing with this. Sometimes, undelivered policies could actually be dummy documents used in rackets to make false claims.
Money, time and effort are wasted to store insurance policies, adding to administrative costs. The loss of policies has its costs: inserting an advertisement in the papers, preparing indemnity bonds, and so on. All this can be avoided through e-insurance.
The insurance sector is still struggling to overcome the penetration challenges even after a decade. One reason for this is the difficulty in tracking customer behaviour as the data is only available with the company, or with the IRDA. It’s too big a project for the IRDA to handle.
Now it is possible to keep track of customers across companies in the form of small data marts, to be analysed by the IRDA, the Insurance Repository services or insurers. This kind of data analysis can help study customer behaviour across products, their investment pattern, loyalty, influence of the agent/distributor/online purchase, premium payment pattern, and so on. This in turn can help customers and insurers plan well across segments. Advanced techniques such as data mining, self-organising maps and so on can help categorise the customers.
All these analyses can be done without loss of privacy and without revealing competitive information as it is a study only at the macro level; no sensitive information is given out. This, in turn, can help customers, the IRDA, insurance companies, third party administrators, brokers and agents frame new strategies to improve the industry.


It is better to keep track of the proportion of usage based on customer profile analysis; disproportionate usage would sound a warning.
There have been instances when a customer has been issued a policy without his/her own knowledge.
In such cases, fraudsters misuse documents submitted for some other purpose, such as opening a bank account or applying for a credit card. Or, forged documents are used to compromise the KYC.
Since e-insurance accounts call for one-time KYC, there is potential for fraud. The way to get around this is to impose more stringent norms for KYC so that authenticity is verified in multiple ways like calling customers and visiting them in person.
Also, the process of dematerialising old paper policies should be speeded up. Efforts to integrate the demat account between e-insurance and SEBI, banks are worth considering.
With the e-insurance account becoming like a bank account, and insurance policies having a good investment component, sharing of access details with agents or third parties could be dangerous. Hence, insurers and repository companies should adopt best practices from credit card companies and online banking services.
Although the internet has good reach across India, not all rural people have access to it. Financial literacy is yet to catch up.
Hence, much-discussed goals of the insurance sector such as rural penetration and micro insurance policies may have to wait a while to reach the demat stage.
However, insurers and insurance repository companies need to keep a watch on rural usage.
Earlier, insurance policies were delivered to the residence of the rural customer. Under the pretence of getting them the all-inclusive e-insurance account, fraudsters could easily fool rural customers. Flagging such scenarios can help act in the interest of rural beneficiaries.
An overall check by pooling the accounts to avoid duplicate e-insurance accounts would be a good initiative.
There are instances of people having multiple PAN cards. That should not be repeated for e-insurance accounts. Severe penalties can act as a deterrent.

Tuesday, 22 October 2013

Display target Vs achievement metrics in a bubble chart

This new chart type is ideal for displaying target vs achievement comparisons.

Choose the measures and dimensions.

Set the chart type.


On hive


Real time Data loading

We've added the ability to load data using a web service. This can be used for real time integration.


modetype:  <append/replace/merge>
data: <JSON formatted data>

Example of data to be loaded:

"columns" : ["txn_cd", "client_cd", "security", "location", "client_id", "txn", "qty", "rate", "brokerage", "ser_tax"],
"rows" : [
 ["1", "P0001003", "Cl001", "BLR", "719", "Buy", "50.000", "133.350", "0.120", "1.210"],
 ["2", "P0001004", "Cl002", "BLR", "720", "Sell", "59.000", "108.750", "0.080", "1.090"]

Support for 'DEFAULT' clause in the Mart definition SQL


The script can now have a DEFAULT clause. The data file being uploaded need not mention a column that has a DEFAULT set.

The mart table will have the column filled automatically.
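To illustrate the behaviour, here is a stand-in sketch using SQLite (not the platform's own engine; table and column names are made up): a column carrying a DEFAULT is filled in automatically when the loaded data omits it.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE txn_mart (
        txn_cd   TEXT,
        qty      REAL,
        location TEXT DEFAULT 'BLR'   -- column the data file may omit
    )
""")
# The insert leaves out 'location'; the engine supplies the default.
con.execute("INSERT INTO txn_mart (txn_cd, qty) VALUES ('1', 50.0)")
row = con.execute("SELECT txn_cd, qty, location FROM txn_mart").fetchone()
print(row)  # ('1', 50.0, 'BLR')
```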

Supporting additional mart indices for performance tuning

Mart maintenance can be used for creating (or maintaining) indices, which is a handy way to do occasional performance tuning.

This mart has no index other than the primary key index.


We write the mart maintenance script.


Register this script.


The new index is created.
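As a stand-in sketch (again using SQLite rather than the platform's own engine, with made-up table and index names), this is the kind of DDL a maintenance script would run to add a secondary index next to the primary-key index:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE txn_mart (txn_cd TEXT PRIMARY KEY, client_cd TEXT, qty REAL)"
)
# The maintenance script's DDL: add an index on a frequently filtered column.
con.execute("CREATE INDEX idx_txn_client ON txn_mart (client_cd)")
# List the indices now present on the mart table.
index_names = [r[1] for r in con.execute("PRAGMA index_list('txn_mart')")]
```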

In appstore environments, the "Apply to Subscribers" checkbox can be checked so that app subscribers also get this index.