XBRL Financial Analytics Platform (Alpha 1.0): First release

Posted on Tue 25 December 2012 in XBRL • Tagged with financial analytics, python, python xbrl, xbrl, xbrl analytics, XBRL Financial Analytics, xbrlfinapp

Hello friends,

Today I am releasing the first alpha of my XBRL analytics platform.

It is available at: http://xbrlfinapp.pythonanywhere.com/

It is a web application aimed at anyone with XBRL data — whether filed with the SEC, filed elsewhere, or not yet in the public domain — who wants to run financial analytics on it.

At present it only works with US-GAAP taxonomy versions 2009, 2011, and 2012.

The website's layout and design are simple: just upload a valid zipped XBRL package and see the result right on the home page. Uploads are limited to 5 MB; if you want that limit raised, please send me the XBRL package in question.

Only a limited number of financial formulas (key performance indicators) have been included, because the current website design and hosting plan restrict them. These formulas use data from the XBRL package only; there is no access to market information such as share price.

Well, all of those improvements are in progress.

This app presents the final analysed data in the form of a table, similar to how Arelle loads an XBRL file. But it is not the same: under the hood it runs a more involved data-finding algorithm.

For example, consider two cases. First: "Correctly determining what total equity is if an SEC filer does not provide total equity." Second: "Correctly determining what total revenues is when the filer does not provide total revenues, or when they use obscure concepts to express revenues."

Let's put this challenge to my app: first download one XBRL filing from the SEC here, then upload it to the xbrlfinapp application.

Case 1: the filer has not provided the total equity concept (us-gaap_StockholdersEquity), and my application has found its three components: RetainedEarningsAccumulatedDeficit, CommonStockValue, and AccumulatedOtherComprehensiveIncomeLossNetOfTax.

Case 2: the filer has not provided the 'total revenues' concept (us-gaap_Revenues), and in that case my application has found its two components: InterestExpense and SalesRevenueNet.

Similarly, the filer has not provided the 'pre-tax income loss' concept (us-gaap_IncomeLossFromContinuingOperationsBeforeIncomeTaxesExtraordinaryItemsNoncontrollingInterest), and my application has found its two components: IncomeLossFromEquityMethodInvestments and IncomeLossFromContinuingOperationsBeforeIncomeTaxesMinorityInterestAndIncomeLossFromEquityMethodInvestments.
In other cases, the filer has not provided the 'total costs and expenses' concept (us-gaap_CostsAndExpenses), and my application has found its five components: InterestExpense, CostOfGoodsSold, OtherPostretirementBenefitExpense, SellingGeneralAndAdministrativeExpense, and PensionExpense.

But Current Assets (us-gaap_AssetsCurrent), Current Liabilities (us-gaap_LiabilitiesCurrent), Total Assets (us-gaap_Assets), and Operating Income Loss (us-gaap_OperatingIncomeLoss) are present in this filing, so they are not recalculated.
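The fallback idea in the cases above can be sketched in a few lines of Python. The fact values and the component list here are hypothetical stand-ins; the real application extracts facts from the uploaded XBRL package and resolves components via the US-GAAP taxonomy.

```python
# Components that can rebuild total equity when the filer omits it
# (list taken from Case 1 above; real resolution uses the taxonomy).
EQUITY_COMPONENTS = [
    "RetainedEarningsAccumulatedDeficit",
    "CommonStockValue",
    "AccumulatedOtherComprehensiveIncomeLossNetOfTax",
]

def total_equity(facts):
    """Use the reported total if present; otherwise sum its components."""
    if "StockholdersEquity" in facts:
        return facts["StockholdersEquity"]
    return sum(facts[c] for c in EQUITY_COMPONENTS if c in facts)

# Hypothetical filing with no StockholdersEquity fact:
facts = {
    "RetainedEarningsAccumulatedDeficit": -1200.0,
    "CommonStockValue": 5000.0,
    "AccumulatedOtherComprehensiveIncomeLossNetOfTax": 300.0,
}
print(total_equity(facts))  # 4100.0
```

The same pattern applies to total revenues, pre-tax income, and total costs and expenses: prefer the reported total, and fall back to summing whichever components the filer did provide.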

In this way, you can upload any XBRL package from the SEC (under UGT versions 2009, 2011, or 2012).

[Screenshots of the analysis results, taken 2012-12-24]

There might also be bugs or errors in the analysis; if you find any, please let me know at 'namitkewat@gmail.com'.


Super-charging Python code with C-Extensions or Cython

Posted on Sun 16 December 2012 in XBRL • Tagged with Cython, financial analytics, python, xbrl

My previous post, Financial Analytics in XBRL, described a program in pure Python with a little 'C' touch (because my base XML parsing library is cElementTree). It took 20-21 minutes to process 200 XBRL packages.

But that's not the end: I have optimized it with Cython, a C-extension system for Python that can give multiple-fold performance increases.
I simply compiled my raw module to a '.so' and imported it as usual, and I found a 25% reduction in time. It now takes around 14-15 minutes for the same data set of 200 XBRL filings.

See the screenshot I have just taken:

[Screenshot: timing output]

Here I have not added any type declarations or typed arguments, and yet I got a 25% reduction in overall time. So you can imagine what will happen when I implement those changes.
Anyway, 'C' rocks!


Large-scale data analytics in XBRL using Python

Posted on Tue 20 November 2012 in XBRL • Tagged with financial ratios, python, python xbrl, sec fillings, xbrl analytics

Hello all,

I am just now completing my 8-month-old experiment: to develop something, using XBRL, that can reduce the dependency on financial information processed by third parties.

I mean that the current dependency on Yahoo Finance, Bloomberg, Google Finance, 9W Search, I-Metrix/EDGAR Online, etc. for financial information should not have to exist.

I work as a financial analyst doing XBRL tagging, verification, and QC of financial information.

I thought: if we can tag this information, then we can use it for analytics.

So let's build it. And now it's almost done. I have just run my program on 200 companies, and it took only 20-25 minutes to analyze them.
You can see the output at the link below. It is for 200 SEC filings (not the entire quarter, because it took me 3 hours to download them and just 20-25 minutes to analyze them).

Download the 200 filings' analysed data

I am covering 21 main financial ratios and around 20-21 financial facts.

Ratios are also covered for the dimensional case.
For example, for the company OPLINK COMMUNICATIONS INC (0001022225), the fixed asset turnover ratio for UnitedStatesMember is 10.7756 while ChinaMember has 1.2279, for the period 2011-07-04/2012-07-01.

You can see how much detail can be pulled out if the XBRL document is well formed.

There was a time when 10% of the current task took more than 3 hours; at present the whole thing takes just 20-25 minutes on my Intel Core 2 Duo with 2 GB RAM. Even then, my instinct tells me it can go down further.

For that, I am really thankful for Python. It was a great experience for me; the entire coding was done in Python only.
I hope what Python has given to genome sequencing, astronomy, and physics will be repeated in finance as well.

The scope of such a program runs from big data centers to small investors: stock exchanges, financial information sites like Google Finance whose operators have to perform custom or personalized calculations on financial data, and end users like you and me.

If you have more formulas to include that can easily be calculated from XBRL, please mention them in a comment below.

For a more detailed study, download this text file, which contains my program's analysis of Google's and 3M's filings: Google and 3M

For any query regarding my development, email me at "namitkewat@gmail.com".

And any suggestions for the current work?


xbrlMapper "Facts": my first attempt at Arelle's fact list

Posted on Mon 17 September 2012 in XBRL • Tagged with arelle, python xbrl, xbrl instance facts, xbrlMapper

I was wondering what I could do with my xbrlMapper module, so I thought I would try to make something like Arelle's fact list (the third and last image of this post; www.arelle.org).

I have extended my present library with a new method, "reportGen", which returns dicts of the instance document's facts. And I really love dicts, especially after watching the PyCon 2010 talk "The Mighty Dictionary".
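The general shape of such a method can be sketched like this. The instance fragment, concept names, and function name below are all hypothetical; the real reportGen works on full SEC filings (and was built on cElementTree under Python 2, for which `xml.etree.ElementTree` is the modern stand-in).

```python
import xml.etree.ElementTree as ET

# Minimal, hypothetical XBRL-like instance; real filings hold thousands of facts.
INSTANCE = """<xbrl xmlns:us-gaap="http://fasb.org/us-gaap/2012-01-31">
  <us-gaap:Assets contextRef="I2012" unitRef="usd" decimals="-3">500000</us-gaap:Assets>
  <us-gaap:Liabilities contextRef="I2012" unitRef="usd" decimals="-3">200000</us-gaap:Liabilities>
</xbrl>"""

def report_gen(xml_text):
    """Map each fact element to a dict of its attributes plus its value."""
    facts = {}
    for el in ET.fromstring(xml_text):
        concept = el.tag.split("}")[-1]          # drop the namespace URI
        facts[concept] = dict(el.attrib, value=el.text)
    return facts

facts = report_gen(INSTANCE)
print(facts["Assets"]["value"], facts["Assets"]["contextRef"])  # 500000 I2012
```

One dict lookup then retrieves any fact by concept name, which is what makes the dict-of-dicts layout so pleasant to query.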

See the snaps I am uploading to this post. Though the processing is fast, I am not satisfied, because if it now takes 2.5 seconds for one XBRL package, how much time will it take for 20,000 XBRL packages!

You know, when I started working on this, it took 71 seconds on the first attempt (in the evening of day one); it felt like the stone age.
So I decided to improve it: then 25 seconds, 20 seconds (7 PM of day one), then 35-40 seconds (11 PM, day one)... that evening was very funny, it kept hovering around 30 seconds! Then suddenly 9.5 seconds (at 00:30 AM of day two), then 4.5 seconds (around 5:30 AM, day two), and finally about 2-2.5 seconds (7:30 AM, day two, at which point I had to leave for the office). But my intuition tells me we can reduce it even further.

[Gallery: three screenshots]

The first image is a snapshot of Arelle's fact list; the second shows the amount of Python code I had to write to achieve this; and the last image shows the final output, a SQLite DB viewed in a Firefox plugin.


Processing files larger than RAM in python

Posted on Tue 11 September 2012 in General • Tagged with large files, mongo-db log reading, python, reading files larger than RAM in python

I got a 131 GB log file from MongoDB, collected over the last 6 months. So I thought: let's count the number of lines.

Here the file size is greater than RAM, so after searching the internet, I found a solution on http://stackoverflow.com.

I modified it a little, but the work got done.

It took around 75 minutes, and the total number of lines was around 2.1 billion.
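The approach can be sketched as below: read the file in fixed-size binary blocks and count newlines, so memory use stays constant no matter how large the file is. (This is a sketch in the spirit of the Stack Overflow answer, not the exact code I used.)

```python
import os
import tempfile

def count_lines(path, block_size=1024 * 1024):
    """Count newlines by reading fixed-size binary blocks (constant memory)."""
    lines = 0
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            lines += block.count(b"\n")
    return lines

# Tiny demonstration file; the same loop handles a 131 GB log unchanged.
with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
    tmp.write(b"first\nsecond\nthird\n")
print(count_lines(tmp.name))  # 3
os.unlink(tmp.name)
```

Reading in binary mode avoids decoding overhead, and the 1 MB block size can be tuned to the disk's throughput.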