## 14.1 – Position Sizing

I know, the discussion on pair trading was to end with the previous chapter, but I thought I had to discuss a special case before we finally wrap up. I’ll also try and keep this chapter really short ☺

So here you go.

I ran through the pair trading algo yesterday evening (28th May) and found a very interesting trade. Here are the regression parameters –

- Stock X = ICICI Bank
- Stock Y = HDFC Bank
- ADF = 0.048
- Beta = 0.79
- Intercept = 1626
- Std_err = 2.67

What do you think of it? Perfect, isn’t it? It’s ICICI and HDFC, two of the largest private sector banks; both have a similar business landscape, both have similar revenue streams, and both are regulated by the RBI. Perhaps the perfect candidate for a pair trade, right?

The ADF value is 0.048, which means there is only a 4.8% chance that the residual is non-stationary, or about a 95.2% chance of the residuals being stationary, which is fantastic.

The std_err is +2.67, which is a perfect residual value to initiate a short pair trade. The trade here is to short HDFC and go long on ICICI.

So, how do we position size this? Here are the price and lot size details –

- HDFC Fut Price = 2024.8
- HDFC Lot size = 500
- ICICI Fut price = 298.8
- ICICI Lot size = 2750

Remember we discussed position size in the previous chapter. We look at the beta and estimate the number of shares required for this trade.

The beta is 0.79, which means every 1 share of Y needs to be offset with 0.79 shares of X. The lot size of HDFC (Y) is 500, so to offset the beta we need 500 × 0.79 = 395 shares of ICICI (X).

Do you see the problem here? The lot sizes simply do not match.

We cannot simply trade 1 lot each here like we did in the Tata Motors and Tata Motors DVR example discussed in the previous chapter. If we do, then this won’t be a beta neutral trade.

Hence to position size this, we need to work around with the lot sizes –

The lot size of ICICI is 2750, the beta is 0.79, and the lot size of HDFC is 500. Given that ICICI’s lot size is the larger of the two, what is the minimum number of HDFC shares that will beta neutralize 2750 shares of ICICI?

To figure this out, we simply divide –

2750/0.79

= 3481.01

Since HDFC trades in lots of 500, we round this off to 3500 shares, which works out to 7 lots of HDFC against 1 lot of ICICI.
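The arithmetic above can be sketched in a few lines of Python. This is only an illustration under this chapter's convention (0.79 shares of X for every 1 share of Y); the function name `lots_for_beta_neutral` and its parameters are made up for this example, and rounding to the nearest whole lot is an assumption.

```python
def lots_for_beta_neutral(beta, lot_x, lot_y, lots_x=1):
    """Return (exact Y shares, whole Y lots) needed to offset `lots_x` lots of X.

    Convention used in this chapter: beta shares of X offset 1 share of Y,
    so a given number of X shares needs (X shares / beta) shares of Y.
    """
    shares_x = lots_x * lot_x
    shares_y_exact = shares_x / beta
    # Assumption: round to the nearest tradable lot, with a minimum of 1 lot.
    lots_y = max(1, round(shares_y_exact / lot_y))
    return shares_y_exact, lots_y


# ICICI (X): lot size 2750; HDFC (Y): lot size 500; beta = 0.79
exact_shares, hdfc_lots = lots_for_beta_neutral(beta=0.79, lot_x=2750, lot_y=500)
print(f"Exact HDFC shares: {exact_shares:.2f}")
print(f"HDFC lots: {hdfc_lots} ({hdfc_lots * 500} shares) against 1 lot of ICICI")
```

Because lots are indivisible, the rounded position is only approximately beta neutral; here 3500 shares against an ideal of about 3481 is a small gap.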

## 14.2 – Intercept

Alright, now that we know the position size as well, here is the big question – will you take this trade?

Everything seems perfect, right? ADF has a desirable value, residual is at 2.67 SD, the two stocks are highly correlated, the business is similar. So what can go wrong?

Yes, I agree, everything looks good, but on a closer look, the intercept reveals a slightly different story.

To understand this, we need to quickly revisit the regression equation –

**y = Beta * x + Intercept + Residual**

If you think about this equation, we are trying to explain the stock price of Y in terms of the stock price of X multiplied by its beta. The intercept is essentially that portion of Y’s stock price which the model cannot explain, and the residual is the difference between the predicted Y and the actual Y.
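To make the three terms concrete, here is a minimal sketch of a one-variable least-squares fit in plain Python. The two price lists are toy numbers invented purely for illustration (they are not market data), and `fit_line` is a hypothetical helper name.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = beta * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    beta = s_xy / s_xx
    intercept = mean_y - beta * mean_x
    return beta, intercept


# Toy series, made up for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.1, 4.9, 7.2, 8.8, 11.0]

beta, intercept = fit_line(x, y)

# Residual = actual y minus the y predicted by the fitted line.
residuals = [yi - (beta * xi + intercept) for xi, yi in zip(x, y)]

print(f"beta = {beta:.2f}, intercept = {intercept:.2f}")
print("residuals:", [round(r, 2) for r in residuals])
```

With an intercept in the fit, the residuals always sum to zero, which is why we track how far the latest residual has drifted rather than their average.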

Going by this, a large intercept implies that a large portion of Y’s stock price cannot be explained by the regression model.

In this case, the intercept is 1626. The stock price of HDFC is 2024 per share, which means 1626 out of 2024 cannot be explained by the regression equation. In other words, the regression equation cannot explain nearly 80% (1626/2024) of Y’s stock price and can explain only about 20% of it, which according to me is quite tricky.
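The same check can be written as a tiny helper. This is a sketch, not a standard metric: `intercept_share` is a name invented here, and it simply divides the intercept by Y's price as the paragraph above does.

```python
def intercept_share(intercept, price_y):
    """Fraction of Y's price that sits in the intercept term."""
    return intercept / price_y


share = intercept_share(intercept=1626, price_y=2024)
remainder = 1 - share

print(f"Intercept as a share of Y: {share:.1%}")
print(f"Share left for the rest of the equation: {remainder:.1%}")
```

Running this on every candidate pair before looking at the residual makes it easy to set aside pairs where the intercept dominates Y's price.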

This further implies, that if we are trading this pair, then we are essentially trading a very small probability here. I’d rather avoid this and look for another opportunity than trade this. Of course, I know traders who would love to jump in and take this trade, but for someone like me, I’d look at risk first and then the reward ☺

Good luck!

Thanks for the further insight

Good luck, Kalpana.

Hello sir

I have a doubt.

HDFC= 0.79*ICICI+ INTERCEPT+RESD

For every 100 shares of ICICI we need 79 shares of HDFCBANK. 1 Lot (ICICI)= 2750. Thus we require 2750*0.79 or 2172 shares of HDFCBANK. Therefore we need 2172/500 or 4 lots (approx) of HDFC to trade.

1 lot of ICICIBANK and 4 lots of HDFCBANK, i think, is a combination.

Thanks

For every 0.79 shares of ICICI, we need 1 share of HDFC. Given the lot size of ICICI as 2750, we need 2750/0.79 = 3481 shares of HDFC, therefore rounded off at 3500.

Thank you, sir. I got it.

Cheers!

Hi Sir. Thanks for bringing in another angle which we couldn’t think of. Its like inhaling a trading wisdom!

Just wanted to find out below;

– What are the chances of getting stopped out, say out of 100 trades (just to assess the success rate), in both the 1st pair trading method (correlation) and the 2nd method (regression)?

I’ve stopped using the first method long ago, Thirumal, and for the 2nd method, I think the success ratio is fairly higher something around the region of 6 of 10 trades working in your favor. The key here is to keep a sharp eye on risk parameters.

Thank you for the information sir

Welcome!

What intercept value is right you think?? Pair I am tracking explains 71.83 of NIFTy(y).

70-80% or higher is a good equation according to me. I’d be hesitant below that.

In today’s session it fell down to 63% , Do you think I should wait before it jumps back above 70% to take the trade.

Yes, that would be better, Akshay. Better safe than sorry 🙂

Hey Karthik,

Just wanted so Thank You, since you started this pair trading chapter I spent most of my time thinking about perfect pair trade setup.

I have finally found out using multiple regression method an equation that explains almost 99% of dependent variable.

Is it too perfect to be true?

99% seems too good to be true, Akshay 🙂

Which pair is this?

I am using more than one independent variable to predict just one dependent variable, like using few heavy weights to predict Index.

Yes, this is possible in a multivariate regression. But we have not really discussed that here 🙂

I tried using regression concept between two stocks but it was quite difficult to find stationarity, so I reduced my time frame & used multivariate regression. This is providing better results & stationarity.

Nice, good luck Akshay!

Hi Karthik,

Amazing write-up!!!

Just a query on hdfc hdfc bank pair. In last 15 days there was a deviation but it never reverted which is strange given it is a highly co-integrated pair.

Request your detailed analysis as a live example on this pair:

1. Whether trade was there in the first place ?

2. Which date would have been the best entry ?

3. Did it hit the stop loss? What should be the SL in such cases ?

4. Is averaging suggested if it moves against you ?

Thanks in advance.

Regards

Deepu

Deepu, I don’t track HDFC and HDFC Bank as a pair, so I won’t be able to answer this question.

On your query about averaging: no, it’s not a good idea to average. Not just with pairs, but with every other trade.

Hello sir

I too run the regression b/w HDFCBANK(y) & ICICIBANK(x) for last 200 days starting from 7 Aug,2017. On 28-May-18 i got the following data:

Beta= 0.75, intercept=1639, std.error=67.74

But the ADF value was 0.732

On seeing the graph of residual it is clearly evident that the plot was continuously diverging from the mean from last 50-60 days.

Thanks

Hmmm, there is a big difference in the ADF values.

If you wish i could show you the plot. On seeing the plot i couldn’t convince myself that the ADF value could be that low.

Sure Mayank, but currently too many things on the plate. Maybe I’ll look at it one of these days. Thanks for your co operation.

Hello Karthik,

I have been trying to develop the script as you have motivated, currently, i am also getting the same results as KUMAR MAYANK posted above. If you could provide few things from your side, it would be great help to develop my code :

(1) Any 200 Day data of Any two stocks

(2) Output of Linear Regression on these data

(3) Output of ADF Test from your script for these particular data,

If you could provide these then it would be helpful to find out if my script is generating right data or not.

Thanks

Kalpana, you can download the data from here – https://www.nseindia.com/products/content/equities/equities/eq_security.htm and run the linear regression as explained here – https://zerodha.com/varsity/chapter/linear-regression/, you can even download the linear regression sheet. Unfortunately, I won’t be able to share the ADF code as it does not belong to me.

https://docs.google.com/spreadsheets/d/1zUUTKwoTT3NfyEsHVEv1EMWJ13li3BF-pUWNlc_0Mmk/edit?usp=sharing

Hello, KarthikJi.

Good Evening,

gone through the last example you shared and it helps me a lot to accurate my trade decisions.

as u mentioned : In this case, the intercept is 1626. The stock price of HDFC is 2024 per share, this means, 1626 out of 2024 cannot be explained by the regression equation. This means, the regression equation cannot explain nearly 80% (1626/2024) of Y’s stock price or in other words the equation can explain only 20% of the equation, which according to me is quite tricky.

what if Intercept Value is negative (-) or sometimes greater than Value of Y stock Price ?

kindly go through the sheet i shared once pls

thanks

Mohit

Interesting point on intercept being -ve. I will get back to you on that soon. Meanwhile, if the beta is -ve, you cannot really trade the pair.

Enjoyed the whole module sir, is there a book or two that you can recommend to further my learning in advanced statistics?

is it prudent to pair HDFC with Nifty, since it has high weight-age in index calculations, also is there any rule that has to be followed while selecting pairs sir?

Hope I am not bugging you with too many questions, I see that you have split automobile segment in 2-wheeler and 4-wheeler, like that is there any sub category in IT and FMCG space?

No Mani, you are not bugging me 🙂

IT can be split as IT large enterprises like HCL, TCS, Wipro, Infy and medium enterprises like Mindtree, Sonata, etc. Same with FMCG.

Mani, we will have to run the regression and test it to identify whether this is worth trading as a pair.

Check this book, Mani – https://www.amazon.com/Pairs-Trading-Quantitative-Methods-Analysis/dp/0471460672

Hello Karthik ji,

As mentioned above in this case we had 1626 Intercept for Y stock HDFCBANK price at 2024.

In some cases I had negative values of intercept and in some cases it is more than Y stock Price .

Kindly suggest on how to interprete it.

Thanks

Negative values can’t be traded, I guess. Will get back on that. Don’t think it can be more than Y’s price.

Hello Karthikji,

Pls clarify the doubt about it, m confused still …

Thanks

What is the doubt, Mohit?

Hello sir

You said that the intercept should be much lower with respect to “y” or “beta times X”. I have a doubt. What if the intercept is low but the variance of it is very high or the value of t-stat is very low? Will you take a trade if all parameters are fine save this?

If intercept is high but its variance is quite low or the value of t-stat is very very high. Will you trade it if all other conditions are met?

Thanks

This is the reason we filter them with the ‘Error Ratio’ 🙂

Sir

Next topic can be on Personal finance

Thanks

Thanks Siva, will give this a thought.

Hi sir, could you please share the latest pair data sheet

Thanks

Here is something you may like – https://twitter.com/ZerodhaVarsity/status/1004200948531585026

Many Thanks

Welcome!

Hi karthik,

How many modules do u plan to write???

I’m tempted to write 2 more – Mutual Funds and Financial Modelling 🙂

What about Quantitative analysis ??

Is there a Chapter on above topic?

This is quantitative, Ravi 🙂

I think, you are missing a module on Qualitative Analysis.

Hello Karthik Sir ,

Will you please consider to write a few chapters on “The ugly side of the markets” like the operators circle , the fake multibagger messages and how prices are manipulated by biggies . If possible how to identify it , either for our benefit or to completely stay away from it . And would also completely love to hear some of stories about this.

Interesting suggestion, Atharva 🙂

Will think about this.

Hi,

Can the ADF plug-in utility be had for a price, or can I get in touch with the developer?

Suggest you find a developer for building the algo.

Dear Karthik,

I saw tweet about finishing of this module.

I request you to please consider module regarding excel modeling for analysing stocks, options, futures etc.

Thank you

Pramod

The discussion on Pair Trading is done, Pramod. There is more to discuss in this module. We already have enough content on Futures & Options in Varsity – https://zerodha.com/varsity/

Hello Sir, Thanks for all the modules of the varsity. I am a regular varsity reader and all this materials have just increased my confidence to trade. Was curious to know if this is the last module or you plan to introduce some more concepts?

Also, kudos to Zerodha for attaining the most number of active traders in such a short span. Its a pleasure to have brokers who are great mentors too.

Pritam, I’m so glad you liked the content here. I plan to discuss few more trading systems in the module. Post this module, I may stop adding new content for a while. But I do intend to have a module on financial modeling and personal finance sometime soon.

Hello sir

I got a trade opportunity b/w SBI(X) and Baroda(Y) on 5 June with 200 data points.

Beta= 0.438

Intercept= 28.76 (22% of Y)

Z-Score= -2.678

ADF= 0.0109

But i faced a problem with the position sizing. If i long 2 lots (8000) of Baroda then ideally i should have shorted 3504 shares of SBI whereas one lot of SBI has 3000 shares. If i try to balance the size then capital (8 lacs) requirement goes beyond my reach. Still i have been tracking it on paper with 2 lot of Baroda and one lot of SBI.

I want to know your views on the 1) Trading signal 2) Position size.

Btw, the trade is in green 🙂 and touched the profit of Rs 2600 when zscore was at -2.255. Got some relief after a setback in Baroda-IndianBank pair.

Thank you very much, sir 🙂

ahh.it was a profit of Rs 26000 on paper. Have missed to add one more ‘zero’ 🙂

Hope this translates to real money soon 🙂

Hello sir

I have been tracking the SBI and BOB pair and logging the details. On 11 June the trade was at 28000+ profit with Z-score of -2.11. Now today 18-06-18 the trade ended with the loss of -19500 and z-score 3.11 (dynamic Z-score is -2.84). I checked the adf value and it was 0.1214.

I wish if your comment on this trade, sir 🙁

Thank you

Kumar, its just unfortunate that this pair is taking so much time to converge. This is, in fact, the drawback of pair trading – you need liquidity and patience to hold the pair.

This should also give you a sense of optimizing each pair – not all pair may be worth initiating at 2.5/2.7 and closing at the mean, sometimes, based on the pair, you may want to initiate and close quicker.

Yes, this seems like a valid trade and the best possible position size is 2:1 on BOB:SBIN.

1) Perfectly valid signal

2) 8000 BOB vs 3000 SBI

Ideally, you should close the position when you hit around -1 on the Z-score.
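For readers who want to see how far a whole-lot combination sits from the ideal hedge, here is a rough sketch using the numbers from this exchange (beta 0.438, Baroda lot of 4000, SBI lot of 3000). The helper `hedge_gap` and the list of lot combinations are assumptions made for illustration.

```python
def hedge_gap(beta, lots_y, lot_y, lots_x, lot_x):
    """Return (ideal X shares, relative gap of the actual X position)."""
    ideal_x = beta * lots_y * lot_y   # X shares needed to offset the Y position
    actual_x = lots_x * lot_x         # X shares tradable in whole lots
    return ideal_x, (actual_x - ideal_x) / ideal_x


BETA, LOT_BOB, LOT_SBI = 0.438, 4000, 3000  # Y = Baroda, X = SBI

for lots_bob, lots_sbi in [(1, 1), (2, 1), (3, 2)]:
    ideal, gap = hedge_gap(BETA, lots_bob, LOT_BOB, lots_sbi, LOT_SBI)
    print(f"{lots_bob} BOB : {lots_sbi} SBI -> ideal {ideal:.0f} SBI shares, gap {gap:+.1%}")

ideal_2_1, gap_2_1 = hedge_gap(BETA, 2, LOT_BOB, 1, LOT_SBI)
```

On these numbers the 2:1 combination leaves the SBI leg roughly 14% short of the ideal 3504 shares, which is the compromise accepted here to keep the capital requirement manageable.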

Is it advisable to track the pairs live during trading hours or end of day better?

Yes, it does. You need to track the Z-score.

hi karthik where can we find the trading pairs list, like u said u will publish the same every week

Will upload the file today in this chapter.

Hi Sir,

Thanks for such a wonderful chapter about pair trading. The way you teach is quite remarkable. Eagerly waiting for the next trading system.

Could you please suggest some books (not pure statistics books) for multivariate regression which is applied for trading along with any case study.

Manoj, unfortunately, there isn’t much content on Pair trading itself, forget Multivariate 🙂

All you can find is few SSRN papers on the concept.

I will probably discuss an easy way to trade calendar spreads next 🙂

Hello Sir,

This was an excellent module, enjoyed reading it.

I wanted to know, while looking at the intercept, what percent do you consider should be explained by equation for Y (dependent variable) ?

Thanks, in advance.

Glad you liked it, Pritam. The equation should explain at least 70% of Y, which means the intercept should be no more than about 30% of Y.

Thank you for writings..

What will be the next module Karthik gaaru

Next chapter will be on Calendar spreads, hopefully next week 🙂

can we get any purchase link for the ADF test plugins

I think Mayank Kumar had suggested few links in one of his comments, maybe you can check that.

Hi Karthik,

You have mentioned in this post that, this post is risky because 80% of HDFCBANK cannot be explained. Can you specify below what % of Std. Error it is safe to trade on the pair. I get that 80% is definitely high. But below how much % would you have considered this trade as safer?

Thanks for the brilliant post on this topic. You have really simplified the pair trading topic.

Pratik, I prefer pairs where the equation explains at least 70% of the dependent variable, i.e. the intercept is no more than about 30% of Y.

Hi Karthik,

Thank you for the fantastic work but the topics are very heavy for the common people. Your creation of any topic and teaching method is fantastic. Your way of viewing in each topic always stunned me and I learnt a lot. My small suggestion is Personal finance for the next chapter with lesser math but with new idea.

Sankar, thanks for the kind words and suggestion. I have always wanted to do a module on PF, will try and do the same shortly. Thanks.

In 12 Jun pairdata list y stock bank Baroda and x stock sbin beta neutrality 4000/0.44=9090

Then we need to sell 3 lots of sbin for every 1 lot bank Baroda buy, Is it right?

Thanks in advance sir..

Nope, 2 lots of BoB with 1 lot of SBIN.

Hi Karthik.when can we expect financial modeling…

Sometime this year, Shiva 🙂

Is it possible for you to upload the sheet where we can get the trade signals?

Trying to do that, Shivu. May take some time.

Hi Karthik, The entire strategy series was lovely. I had read linear regression earlier but thanks to you now I am learning to apply them on trading setup. One request/suggestion on the next topic in the line of many such requests. There are many traders who are using Market Profile but there is not very good study material available on web. Can you plan some chapters on this topic & its usability along with price action.

Thanks!!

Glad you liked the content, Sumit. Content on market profile has been on the radar for a while, will try and put this up sometime soon, thanks 🙂

Glad to know that MP is in the radar. thanks!

Cheers!

Dear Karthik,

Can you please tell me how to calculate change in bank nifty ( or any other index) due to change in stocks.

say HDFC bank having weightage around 34 %. If HDFC bank rose by Rs 25 how much likely change in bank nifty considering other stocks being constant.

please help me how to do this in excel for all constituents of bank nifty.

You can use the Nifty Sensitivity tool here – https://www.equitymaster.com/india-markets/nse-replica.asp?utm_source=submenu

Thank u karthik ji .

But only has the nifty and sensex and not bank nifty.

Could please elaborate on how to calculate on excel

Sorry, Pramod. I’m not sure if I understand your query. Can you kindly elaborate? Thanks.

I want to check which stock is driving bank nifty on any single day if bank nifty is in good momentum.

That is why i want to know how much points are contributed by any bank nifty constituent stock say ICICI bank.

Eg

If icici bank changes by 4 points how much change there will be in bank nifty is other stock are assumed to be constant.

Is this kind of working possible in excel? if yes.

Then can u please help in doing this analysis

Thanks you

Equity Master has this sorted out already, check this (look for Nifty Sensitivity) – https://www.equitymaster.com/india-markets/nse-replica.asp?utm_source=submenu
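If you would rather approximate this yourself, a common back-of-the-envelope method is: index points ≈ weight × stock's percentage change × index level. The sketch below assumes the weight stays constant over the move and all other constituents are unchanged; the stock price and index level are placeholder values chosen for illustration, not actual quotes, and `index_points_from_stock` is a made-up name.

```python
def index_points_from_stock(weight, stock_change, stock_price, index_level):
    """Approximate index move (in points) caused by one constituent's move."""
    stock_return = stock_change / stock_price
    return weight * stock_return * index_level


# Weight of 34% and a Rs 25 move are from the question above;
# the stock price and index level below are placeholders, not real quotes.
points = index_points_from_stock(
    weight=0.34,
    stock_change=25,
    stock_price=2000.0,
    index_level=26000.0,
)
print(f"Approximate index move: {points:.1f} points")
```

The same formula translates directly to a spreadsheet: one row per constituent with its weight, price, and price change, and a column multiplying the three terms.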

I really like knowledge shared here on varsity. It is very useful to a trader like me to understand various aspect of trading.

I follow this pair trade section particularly.

I found something on ADF TEST.

I DID NOT KNOW A SINGLE THING ABOUT IT

BUT MAYBE MR. KARTHIK CAN TELL US WHETHER IT IS USEFUL OR NOT.

http://www.spiderfinancial.com/support/documentation/numxl/reference-manual/statistical-tests/adftest

I’ve never used this, Vihar, so I really cannot comment on this. But sure does look good enough.

Hi Karthik,

The two pair data sheets uploaded with last 2 chapters have got tata motors & tata motors dvrm pair as both (x & y) & (y & x) each. Does that mean we can trade both of these pair?

Sumit, I think the only valid pair is TMDVR (X) and TM (Y).

Hi Karthik,

I regressed Bank Nifty & Nifty using last 200 days data. The stats coming for 1) BKNifty(x) & Nifty(y) was intercept coeff = 2498 & Std error of intercept/Std error = 2.175. Other pair 2) BKNifty(y) & Nifty(x), intercept coeff = -1283 & Std error of intercept/Std error = 2.55. My q is If we go by smaller error ratio, pair 1) the value of intercept coefficient is big. The value of intercept coefficient is less for 2nd pair but error ratio is big. Should we still go ahead with pair 1)?

We cannot go with a -ve intercept, so your 2nd option is not viable. The first one has a large intercept I guess. So either ways, it may not be a great pair to trade 🙂

Sir please provide commodity pair data sheet too.. Thanks

Have not developed one, Kehav. Will check if this is possible. Thanks.

Hi Karthik, thanks for the prev response!!

I have set up few other trades using regression data from 12th june excel and am tracking the trades. Pardon me for a long post ahead.

My question are on another trade setup between Andhra – Allahabad bank pair. Set up – (Allahabad Bk = y & Andhra Bk = x, Beta = 0.76. Z score was -1.72 at the time of trade). To make a beta neutral pair, I did long on Andhra BK(10000 lot), short Allahabad Bk(10000 lot) and long stocks of allahabad Bk (2346 Stocks in spot). What happened is that trade moved in favour and z score changed to -1.68 but profit went down due to excessive loss in 2346 extra stocks I purchased for beta neutrality. If I remove this long position than my actual profit has gone up since the trade execution.

1: Is it that by using beta neutrality I did a trade off between risk of z-score going further away from mean & amount of profit that pair will make If z score goes toward mean ?

2: Should we make every pair beta neutral?

3: Practically how long can we use same set of regression data without calculating again?

4: If we find a particular pair has stationary residual, will the pair remain intact unless their is substantial change in fundamentals of any stock. In other words, do we need to keep on running adf test on a particular pair very often?

Thanks again

1) Not really, Sumit. Beta neutrality ensures your risk is in check.

2) Yes, at least in this way to pair trade

3) Ideally for every new trade

4) Not really, all else equal, adf remains same.

Hi Karthik,

Thanks for the excellent write up time and again.

Quick observations:

1. The stop loss in most cases is given as 0.5 SD but what I have seen that it goes to+- 3 SD and reverses many a times so is it advisable to keep the stop loss as 3 SD ?

2. Further in highly co-integrated pair like Tatamotors the pairs does eventually reverts so should we hold on to the trade ?

3. Is it always ideal to enter at +-2.5 SD level or we should wait for reversal then make the entry ?

4. After reversal also many a times it does not sustain and starts travelling in the unfavorable direction ? How to identify proper reversal ?

Request your views on the above points.

Thanks in advance.

Regards

Deepu

1) Yes, keeping the SL at 3 SD makes sense. So you usually initiate the trade around 2.5 SD and keep an SL at 3 SD.

2) Yes, for this you need both patience and conviction 🙂

3) It is best if you can enter around 2.5, things like ‘waiting for reversal’ can add subjectivity in your analysis

4) This is tough

Good luck.

When can we expect another pair data sheet.

Thanks

Keshav, I will try and upload the file sometime today or tomorrow.

Sir have u uploaded the pair data file.?

Ah, no. Will do today.

hiii sir,

can you tell me how to calculate z score of pair stocks and how to calculate sigma and which intercept we have to take while pair trading??

is that intercept work which will revived after regression??

and i am not sure how to calculate z score please help me with this

Ricky, have explained this in the previous chapters. Can you please take a look? Thanks.

yes thanx for your help sir,

i have understand about intercept but still i cant figure out how to calculate Z score

The latest residual divided by sigma gives you the Z-score.

thanx for such a help sir,

and m sorry if i m bothering you,

but how to find sigma

No problem, Ricky.

Sigma = Residual/Standard Error. I’ve explained in this chapter.

i havent find any thing about sigma in this chapter sir,

but thanq so much for your help…

now i have new query.

which residual we have to use for calculation of Sigma??

is it the RESIDUAL OUTPUT or the ANOVA RESIDUAL which we had after the linear regression

Look for it in the previous chapter.

Sigma = Today’s residual / Standard Error of the residual

yeah i have found it..

its in one line so i have missed it.

my bad..

and again thanx for your help sir..

sir i have read again previous chapter and from i have discover in that chapter you had take live example of Tata motors and Tata motors dvr. in which you calculate std_err as same as you comment me how to calculate sigma so sir my question is you are saying that we can take std_err as Z score to open/exit potions in trade…

do i understand correctly or wrong sir..

Yes, that’s correct, Ricky.

Good luck, Ricky!
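Putting the replies above together, the number used as the trading signal is today's residual divided by the standard error of the residuals. A minimal sketch follows; the residual list is toy data invented for illustration, `residual_z_score` is a hypothetical name, and the use of n − 2 degrees of freedom (as in a one-variable regression summary) is an assumption about how the standard error is computed.

```python
import math


def residual_z_score(residuals):
    """Latest residual divided by the standard error of the residuals.

    Assumption: standard error = sqrt(sum of squared residuals / (n - 2)),
    i.e. the regression standard error for a fit with one X variable.
    """
    n = len(residuals)
    if n <= 2:
        raise ValueError("need more than 2 residuals")
    sse = sum(r * r for r in residuals)
    std_error = math.sqrt(sse / (n - 2))
    return residuals[-1] / std_error


# Toy residuals, made up for illustration; the last value is "today".
toy_residuals = [0.5, -1.0, 0.25, -0.75, 1.0]
z = residual_z_score(toy_residuals)
print(f"z-score of the latest residual: {z:.2f}")
```

In practice the residual list would come from the 200-day regression discussed in the earlier chapters, and the resulting value is compared against the entry and stop-loss levels.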

thnq very much for your help sir,

this will help me complete my code…

Good luck, Ricky!

Excellent contribution sir.. one query, why one can not perform pair trading using Options ?

Thanks in advance

Pair trading works best when there is only 1 variable to tackle i.e the direction. When there are multiple variables like for an option, you can’t really set up a pair trade.

Sir u r not uploading pair data sheet