{"id":1855,"date":"2024-12-26T15:19:58","date_gmt":"2024-12-26T06:19:58","guid":{"rendered":"https:\/\/www.yanagichiaki.jp\/?p=1855"},"modified":"2024-12-27T01:08:40","modified_gmt":"2024-12-26T16:08:40","slug":"introductory-econometrics-homework-lab","status":"publish","type":"post","link":"https:\/\/yanagichiaki.jp\/index.php\/2024\/12\/26\/introductory-econometrics-homework-lab\/","title":{"rendered":"Introductory Econometrics Homework &amp; Lab"},"content":{"rendered":"\n<p class=\"is-style-big_icon_point\">\u30cf\u30eb\u30d3\u30f3\u5de5\u696d\u5927\u5b66\uff08\u6df1\u5733\uff09\u2022 2024 \u2022 \u5165\u9580\u8a08\u91cf\u7d4c\u6e08\u5b66 Homework &amp; Lab \u2022 \u306b\u304a\u3051\u308b\u89e3\u6c7a\u7b56 \u2022 HITSZ \u57fa\u7840\u8ba1\u91cf\u7ecf\u6d4e\u5b66\u4f5c\u4e1a \u2022 \u5b9e\u9a8c 2024<\/p>\n\n\n\n<p class=\"has-border -border03 is-style-icon_info\">\u5fa1\u8cea\u554f\u304c\u5fa1\u5ea7\u3044\u307e\u3057\u305f\u3089\u3001\u3053\u306e\u30da\u30fc\u30b8\u306e\u4e0b\u90e8\u306b\u3042\u308b\u30b3\u30e1\u30f3\u30c8\u6b04\u3092\u5fa1\u5229\u7528\u304f\u3060\u3055\u3044\u3002<br><span class=\"swl-marker mark_yellow\">\u4ef0\u305b\u4e8b\u6709\u4e4b\u5019\u30cf\u30cf<\/span>\u3001<span class=\"swl-marker mark_blue\">\u6b64\u4e01\u4e4b\u4e0b\u30cb\u30a2\u30eb\u610f\u898b\u4e4b\u6b04\u30f2\u7528\u30f0\u7d66\u30d8<\/span>\u3002<\/p>\n\n\n\n<p class=\"is-style-big_icon_batsu\">\u5f53\u30b5\u30a4\u30c8\u5185\u306e\u30b3\u30f3\u30c6\u30f3\u30c4\u306e\u7121\u65ad\u8ee2\u8f09\u3001\u5f15\u7528\u3001\u30b3\u30d4\u30fc\u306f\u7981\u6b62\u3055\u308c\u3066\u3044\u307e\u3059\u3002<\/p>\n\n\n\n<p class=\"is-style-big_icon_caution\">For those titles or questions with at least one \u2018+\u2019 mark, it shows that the corresponding part is of the course numbered &#8220;ECON2010&#8221; as an extra part than &#8220;ECON2010F&#8221;, which is an easier alternative Introductory Econometrics course.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Homework 1<\/h2>\n\n\n\n<h3 
class=\"wp-block-heading\">1 : [15 points : Theory]<\/h3>\n\n\n\n<p>Remind yourself of the terminology we developed in Chapter 1 for causal questions. Suppose we are interested in the causal effect of having health insurance on an individual&#8217;s health status.<\/p>\n\n\n\n<p>(a) [<span class=\"swl-marker mark_green\">2 points<\/span>] We run a phone survey where we ask 5,000 respondents about their current insurance and health conditions. The data we collect is an example of a __________.<\/p>\n\n\n\n<p>(b) [<span class=\"swl-marker mark_green\">2 points<\/span>] The US government has Census data on every elderly American&#8217;s current insurance and health status. This is an example of data for the __________.<\/p>\n\n\n\n<p>(c) [<span class=\"swl-marker mark_green\">2 points<\/span>] Suppose we take our phone survey data and calculate the difference in health between individuals who do and do not have insurance. This difference is an example of an __________.<\/p>\n\n\n\n<p>(d) [<span class=\"swl-marker mark_green\">4 points<\/span>] The difference in health between all Americans who do and don&#8217;t have insurance is an example of an __________. The effect of insurance on health is an example of a __________.<\/p>\n\n\n\n<p>(e) [<span class=\"swl-marker mark_green\">5 points<\/span>] When the two objects in (d) coincide, we have an example of __________. 
Give one reason why the two objects in (d) might not coincide.<\/p>\n\n\n\n<div class=\"swell-block-accordion\">\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (a)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<p>sample<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (b)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<p>population<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (c)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<p>estimate (or estimator)<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span 
class=\"swell-block-accordion__label\">Solution to (d)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<p>estimand<br>(target) parameter<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (e)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<p>identification<br>We might expect richer individuals to be more likely to have health insurance and more likely to be healthy for other reasons. In this case the difference in health of Americans with\/without health insurance is likely to overstate the causal effect of insurance (upward selection bias).<\/p>\n<\/div><\/details>\n<\/div>\n\n\n\n<h3 class=\"wp-block-heading\">2 : [25 points : Theory]<\/h3>\n\n\n\n<p>Let $Y=a+X^{3}\/b$ where $a$ and $b$ are some constants with $b&gt;0$, and where $X\\sim\\mathrm{N}(0,1)$.<\/p>\n\n\n\n<p>(a) [<span class=\"swl-marker mark_green\">2 points<\/span>] State the definition of the cumulative density function of $Y$, which we&#8217;ll call $F_{a,b}(y)$.<\/p>\n\n\n\n<p>(b)&nbsp;[<span class=\"swl-marker mark_green\">5 points<\/span>] Express $F_{a,b}(y)$ in terms of the CDF of the standard normal distribution $\\Phi(\\cdot)$. 
Hint: can you re-write the inequality $Y\\le y$ as an inequality involving $X$?<\/p>\n\n\n\n<p>(c)&nbsp;[<span class=\"swl-marker mark_green\">3 points<\/span>] Express $E[Y]$ in terms of $E[X^{3}]$, then use the fact that $E[X^{3}]=0$ when $X\\sim\\mathrm{N}(0,1)$ to derive $E[Y]$.<\/p>\n\n\n\n<p>(d)&nbsp;[<span class=\"swl-marker mark_green\">4 points<\/span>] Express $Cov(Y,X)$ in terms of $E[X^{4}]$, then use the fact that $E[X^{4}]=3$ when $X\\sim\\mathrm{N}(0,1)$ to derive $Cov(Y,X)$.<\/p>\n\n\n\n<p>(e)&nbsp;[<span class=\"swl-marker mark_green\">2 points<\/span>] Suppose $E[Y]=0$ and $Cov(Y,X)=0.3$. What can you conclude about $a$ and $b$?<\/p>\n\n\n\n<p>(f)&nbsp;[<span class=\"swl-marker mark_green\">6 points<\/span>] Given your answers to (b) and (e), what is the probability that a draw of $Y$ is bigger than zero? What is the probability that a draw of $Y$ falls between $-0.1$ and $0.1$?<\/p>\n\n\n\n<p>(g)&nbsp;[<span class=\"swl-marker mark_green\">3 points<\/span>] Let $W=a+X^{3}\/b+Z$ where $Z$ is mean-zero and independent of $X$. 
How does the distribution of $E[W\\mid X]$ (recall this is a random variable) compare to the distribution of $Y$?<\/p>\n\n\n\n<div class=\"swell-block-accordion\">\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (a)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<p>By definition, $F_{a,b}(y)=Pr(Y\\le y)$.<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (b)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<p>We have\\begin{align*} Y\\le y &amp; \\iff a+X^{3}\/b\\le y\\\\ &amp; \\iff X\\le\\sqrt[3]{b(y-a)} \\end{align*}using the facts that $b&gt;0$ and that $f(x)=x^{3}$ is increasing. 
Thus $Pr(Y\\le y)=Pr(X\\le\\sqrt[3]{b(y-a)})=\\Phi(\\sqrt[3]{b(y-a)})$.<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (c)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<p>$E[Y]=E[a+X^{3}\/b]=a+E[X^{3}]\/b$ by linearity of expectations. So with $E[X^{3}]=0$, $E[Y]=a$.<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (d)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<p>Since $E[X]=0$, $Cov(Y,X)=E[YX]-E[Y]E[X]=E[YX]$. Then\\begin{align*}Cov(Y,X) &amp; =E[YX]\\\\ &amp; =E[aX+X^{4}\/b]\\\\ &amp; =aE[X]+E[X^{4}]\/b\\end{align*}by linearity of expectations. 
With $E[X^{4}]=3$ and again $E[X]=0$, we thus have $Cov(Y,X)=3\/b$.<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (e)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<p>If $E[Y]=0$, we know from (c) that $a=0$. If further $Cov(Y,X)=0.3$, we know from (d) that $3\/b=0.3$, or $b=10$.<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (f)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<p>Given (b), \\begin{align*}Pr(Y&gt;0) &amp; =1-Pr(Y\\le0)\\\\ &amp; =1-\\Phi(\\sqrt[3]{b(0-a)}).\\end{align*}Plugging $a=0$ into this expression yields \\begin{align*}Pr(Y&gt;0) &amp; =1-\\Phi(\\sqrt[3]{b(0-0)})\\\\ &amp; =1-\\Phi(0)\\\\ &amp; =0.5.\\end{align*}Similarly, plugging in both $a=0$ and $b=10$,\\begin{align*}Pr(-0.1\\le Y\\le0.1) &amp; =Pr(Y\\le0.1)-Pr(Y\\le-0.1)\\\\ &amp; =\\Phi(\\sqrt[3]{b(0.1-a)})-\\Phi(\\sqrt[3]{b(-0.1-a)})\\\\ &amp; =\\Phi(\\sqrt[3]{10\\times0.1})-\\Phi(\\sqrt[3]{10\\times(-0.1)})\\\\ &amp; =\\Phi(1)-\\Phi(-1)\\\\ &amp; \\approx0.84-0.16\\\\ &amp; =0.68.\\end{align*}<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" 
data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (g)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<p>We have \\begin{align*}E[W\\mid X] &amp; =E[a+X^{3}\/b+Z\\mid X]\\\\ &amp; =a+X^{3}\/b+E[Z\\mid X]\\\\ &amp; =a+X^{3}\/b\\\\ &amp; =Y\\end{align*}since $E[Z\\mid X]=E[Z]=0$. Thus $E[W\\mid X]$ and $Y$, being equal, have the same distribution.<\/p>\n<\/div><\/details>\n<\/div>\n\n\n\n<h3 class=\"wp-block-heading\">3 : [25 points : Empirics]<\/h3>\n\n\n\n<p>Let&#8217;s prove your answer to 2(d) by simulation.<\/p>\n\n\n\n<p>(a) [<span class=\"swl-marker mark_green\">6 points<\/span>] Create a Stata program that generates a dataset with $N=10,000$ independent draws of a standard normal variable $X_{i}\\stackrel{iid}{\\sim}\\mathcal{\\mathrm{N}}(0,1)$, generates $Y_{i}=a+X_{i}^{3}\/b$ for the values of $a$ and $b$ you found in 2(e), and computes the sample covariance $\\widehat{Cov}(X_{i},Y_{i})$. Run the program a few times. How does this exercise build confidence in your answer to 2(d)?<\/p>\n\n\n\n<p>(b)&nbsp;[<span class=\"swl-marker mark_green\">5 points<\/span>] Run the same program once with $N=10$. Does the result shake your confidence in your answer to 2(d)? Explain.<\/p>\n\n\n\n<p>(c)&nbsp;[<span class=\"swl-marker mark_green\">8 points<\/span>] Modify your program to automatically compute and store $500$ simulated values of $\\widehat{Cov}(X_{i},Y_{i})$ with $N=10$ after fixing the seed to $1630$. Report the average simulated value. 
How does it compare to what you&#8217;d expect from your answer to 2(d)?<\/p>\n\n\n\n<p>(d) [<span class=\"swl-marker mark_green\">6 points<\/span>] How do the mean and variance of the $500$ simulated $\\widehat{Cov}(X_{i},Y_{i})$ change as you increase $N$ from $10$ to $100$? What do you expect to happen as you increase $N$ further?<\/p>\n\n\n\n<div class=\"swell-block-accordion\">\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (a)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">set matsize 5000\nset seed 12345\nforval rep=1\/5 {\n\tclear\n\tset obs 10000\n\tgen X=rnormal()\n\tgen Y=0+X^3\/10\n\tcorr X Y, cov\n}<\/code><\/pre>\n\n\n\n<p>The output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. set matsize 5000\n\n. set seed 12345\n\n. forval rep=1\/5 {\n  2.         clear\n  3.         set obs 10000\n  4.         gen X=rnormal()\n  5.         gen Y=0+X^3\/10\n  6.         corr X Y, cov\n  7. 
}\nnumber of observations (_N) was 0, now 10,000\n(obs=10,000)\n\n             |        X        Y\n-------------+------------------\n           X |  .993742\n           Y |  .300913  .153776\n\nnumber of observations (_N) was 0, now 10,000\n(obs=10,000)\n\n             |        X        Y\n-------------+------------------\n           X |  1.01913\n           Y |  .316717  .164776\n\nnumber of observations (_N) was 0, now 10,000\n(obs=10,000)\n\n             |        X        Y\n-------------+------------------\n           X |  1.00079\n           Y |  .298588  .146994\n\nnumber of observations (_N) was 0, now 10,000\n(obs=10,000)\n\n             |        X        Y\n-------------+------------------\n           X |  1.00011\n           Y |  .297844  .145352\n\nnumber of observations (_N) was 0, now 10,000\n(obs=10,000)\n\n             |        X        Y\n-------------+------------------\n           X |  1.00243\n           Y |  .301918  .152687<\/code><\/pre>\n\n\n\n<p>After setting the seed to $12345$, $a=0$, and $b=10$, I ran my program five times and got sample covariances of $0.301$, $0.317$, $0.299$, $0.298$, and $0.302$. 
These are all close to the $0.3$ I expected from 2(d).<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (b)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">set seed 12345\nforval rep=1\/1 {\n\tclear\n\tset obs 10\n\tgen X=rnormal()\n\tgen Y=0+X^3\/10\n\tcorr X Y, cov\n}<\/code><\/pre>\n\n\n\n<p>The output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. set seed 12345\n\n. forval rep=1\/1 {\n  2.         clear\n  3.         set obs 10\n  4.         gen X=rnormal()\n  5.         gen Y=0+X^3\/10\n  6.         corr X Y, cov\n  7. }\nnumber of observations (_N) was 0, now 10\n(obs=10)\n\n             |        X        Y\n-------------+------------------\n           X |  1.06192\n           Y |  .586814  .436831<\/code><\/pre>\n\n\n\n<p>With the same seed and parameter values I now get a sample covariance of $0.587$, which is very different from $0.3$. But I&#8217;m not too worried about it, since this simulation uses a small sample. 
In such a small sample, we expect the sample covariance to sometimes fall far from the \u201cpopulation\u201d covariance just by chance.<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (c)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">set seed 1630\nmatrix results=J(500,1,.)\nforval rep=1\/500 {\n\tclear\n\tqui set obs 10\n\tgen X=rnormal()\n\tgen Y=0+X^3\/10\n\tqui corr X Y, cov\n\tmatrix results&#91;`rep',1]=r(cov_12)\n}\nclear\nsvmat results\nsumm<\/code><\/pre>\n\n\n\n<p>The output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. set seed 1630\n\n. matrix results=J(500,1,.)\n\n. forval rep=1\/500 {\n  2.         clear\n  3.         qui set obs 10\n  4.         gen X=rnormal()\n  5.         gen Y=0+X^3\/10\n  6.         qui corr X Y, cov\n  7.         matrix results&#91;`rep',1]=r(cov_12)\n  8. }\n\n. clear\n\n. svmat results\nnumber of observations will be reset to 500\nPress any key to continue, or Break to abort\nnumber of observations (_N) was 0, now 500\n\n. summ\n\n    Variable |        Obs        Mean    Std. Dev.       
Min        Max\n-------------+---------------------------------------------------------\n    results1 |        500    .2975147    .2769293   .0058428   1.721521\n<\/code><\/pre>\n\n\n\n<p>I get an average sample covariance of $0.298$, which is again close to the expected $0.3$.<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (d)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">set seed 1630\nmatrix results=J(500,1,.)\nforval rep=1\/500 {\n\tclear\n\tqui set obs 100\n\tgen X=rnormal()\n\tgen Y=0+X^3\/10\n\tqui corr X Y, cov\n\tmatrix results&#91;`rep',1]=r(cov_12)\n}\nclear\nsvmat results\nsumm<\/code><\/pre>\n\n\n\n<p>The output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. set seed 1630\n\n. matrix results=J(500,1,.)\n\n. forval rep=1\/500 {\n  2.         clear\n  3.         qui set obs 100\n  4.         gen X=rnormal()\n  5.         gen Y=0+X^3\/10\n  6.         qui corr X Y, cov\n  7.         matrix results&#91;`rep',1]=r(cov_12)\n  8. }\n\n. clear\n\n. svmat results\nnumber of observations will be reset to 500\nPress any key to continue, or Break to abort\nnumber of observations (_N) was 0, now 500\n\n. summ\n\n    Variable |        Obs        Mean    Std. Dev.       
Min        Max\n-------------+---------------------------------------------------------\n    results1 |        500    .3009726    .0977276   .0643365   .7326479<\/code><\/pre>\n\n\n\n<p>In both cases I get an average sample covariance close to $0.3$ ($0.298$ with $N=10$ and $0.301$ with $N=100$), but with the larger sample the simulated $\\widehat{Cov}(X_{i},Y_{i})$ have a smaller standard deviation: $0.098$ compared to $0.277$. I expect this standard deviation to decrease further as I increase $N$, because of the Law of Large Numbers.<\/p>\n<\/div><\/details>\n<\/div>\n\n\n\n<h3 class=\"wp-block-heading\">4 : [35 points : Empirics]<\/h3>\n\n\n\n<p>Woodbury and Spiegelman (1987; available <a href=\"https:\/\/www.jstor.org\/stable\/1814528\">here<\/a>) report the results of two randomized experiments designed to encourage Unemployment Insurance (UI) recipients to return to work. In the Employer Experiment, an employer who employed a UI recipient for at least 4 months received a voucher worth \\$500. In the Claimant Experiment (a.k.a. the Job-Search Incentive Experiment), any UI recipient who found employment lasting at least 4 months received \\$500 directly.<\/p>\n\n\n\n<p>(a) [<span class=\"swl-marker mark_green\">4 points<\/span>] Load the provided IlExp.dta dataset from this study into Stata. Use the $\\texttt{describe}$ command to show a description of the variables in the dataset. Report a screenshot of the output.<\/p>\n\n\n\n<p>(b)&nbsp;[<span class=\"swl-marker mark_green\">7 points<\/span>] Use the $\\texttt{summarize}$ command to compute the means, standard deviations, and other summary statistics of the variables in the data. Report a screenshot of the output.<\/p>\n\n\n\n<p>(c)&nbsp;[<span class=\"swl-marker mark_green\">5 points<\/span>] Based on your previous answer and the result of the $\\texttt{count}$ command (which reports the total number of observations), which of the variables have missing data? 
Which variable has the most values missing, and what fraction of the total values is missing? Report a screenshot of the output used to answer these questions. How might missing data affect the interpretation of the results of the experiment?<\/p>\n\n\n\n<p>(d)&nbsp;[<span class=\"swl-marker mark_green\">8 points<\/span>] Create a new &#8220;dummy&#8221; variable that indicates whether someone had any post-claim earnings. Compute summary statistics, including the mean and standard deviation, separately for the three treatment arms for the following variables: total benefits paid, age, pre-claim earnings, post-claim earnings, and the dummy variable for any post-claim earnings you just created. Report a screenshot of the output. Which treatment arm has the highest post-claim earnings? Which arm has the highest fraction of people with any post-claim earnings?<\/p>\n\n\n\n<p>(e)&nbsp;[<span class=\"swl-marker mark_green\">6 points<\/span>] Write a few sentences about how economic reasoning might explain the differences in earnings described above across the treatment arms.<\/p>\n\n\n\n<p>(f)&nbsp;[<span class=\"swl-marker mark_green\">5 points<\/span>] Submit clean and well-commented code used for this question.<\/p>\n\n\n\n<div class=\"swell-block-accordion\">\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (a)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">use IlExp, clear\ndescribe<\/code><\/pre>\n\n\n\n<p>The output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. use IlExp, clear\n\n. 
describe\n\nContains data from IlExp.dta\n  obs:        12,101                          \n vars:            17                          10 Jan 2014 17:52\n size:       822,868                          \n------------------------------------------------------------------------------------------------------------\n              storage   display    value\nvariable name   type    format     label      variable label\n------------------------------------------------------------------------------------------------------------\nage             float   %9.0g                 claimant age\nbenpdbye        float   %9.0g                 benefits paid, full benefit year\nblack           float   %9.0g                 claimant is black\ncontrol         float   %9.0g                 control group\nexstbeny        float   %9.0g                 exhausted benefits (benefit year)\nhie             float   %9.0g                 hiring incentive experiment group\nhispanic        float   %9.0g                 claimant is hispanic\njsie            float   %9.0g                 job search incentive experiment group\nmale            float   %9.0g                 claimant is male\nnatvamer        float   %9.0g                 claimant is native american\notherace        float   %9.0g                 claimant is of other race\npospearn        float   %9.0g                 claimant post-claim earnings\nprepearn        float   %9.0g                 claimant pre-claim earnings\nwhite           float   %9.0g                 claimant is white\nwkspdbye        float   %9.0g                 weeks of benefits, benefit year\ntreat           float   %9.0g                 \njsipart         float   %9.0g                 claimant participated in jsi (artificial data created in 2014)\n------------------------------------------------------------------------------------------------------------\nSorted by: \n<\/code><\/pre>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" 
data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (b)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">summarize<\/code><\/pre>\n\n\n\n<p>The output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. summarize\n\n    Variable |        Obs        Mean    Std. Dev.       Min        Max\n-------------+---------------------------------------------------------\n         age |     12,101    33.00083    8.926023         20         54\n    benpdbye |     12,101     2698.75    2083.071          0       8151\n       black |     12,101    .2591521    .4381874          0          1\n     control |     12,101    .3265846    .4689832          0          1\n    exstbeny |     12,101    .4564912     .498124          0          1\n-------------+---------------------------------------------------------\n         hie |     12,101    .3274936    .4693184          0          1\n    hispanic |     12,101    .0754483    .2641243          0          1\n        jsie |     12,101    .3459218    .4756875          0          1\n        male |     12,101    .5495414    .4975602          0          1\n    natvamer |     12,101    .0074374    .0859226          0          1\n-------------+---------------------------------------------------------\n    otherace |     12,101    .0146269    .1200589          0          1\n    pospearn |     11,861    1749.021    2233.563          0      66466\n    prepearn |     11,862     3631.45    2709.897          0      55000\n       white |     12,101    .6433353    .4790344          0          1\n    wkspdbye |     12,101    19.54326 
   12.19206          0         48\n-------------+---------------------------------------------------------\n       treat |     12,101    .6734154    .4689832          0          1\n     jsipart |     12,101    .2914635    .4544553          0          1<\/code><\/pre>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (c)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<p>From the count command or the screenshot in (a), we see that the total number of observations is $12,101$. So only the earnings variables ($\\texttt{pospearn}$ and $\\texttt{prepearn}$) have missing values. The post-claim earnings variable $\\texttt{pospearn}$ has the most missing values: it is non-missing in 11,861 of the 12,101 cases, so about 2% ($240$ observations) are missing. Missing values of earnings could be important because we care about earnings differences across treatment arms, but we may only have a selected sample of earnings. 
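<\/p>\n\n\n\n<p>For reference, these missing-value counts can be computed directly with the $\\texttt{count}$ command, which stores its result in $\\texttt{r(N)}$ (a quick sketch; the comments show the counts implied by the $\\texttt{summarize}$ output above):<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">count                        \/\/ total observations: 12,101\ncount if missing(pospearn)   \/\/ missing post-claim earnings: 240\ndisplay 100*r(N)\/_N          \/\/ share of pospearn missing, about 2%\ncount if missing(prepearn)   \/\/ missing pre-claim earnings: 239<\/code><\/pre>\n\n\n\n<p>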
However, since only 2% of earnings are missing, we hope that this selection bias will be small.<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (d)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">gen anypostearnings=pospearn>0\nreplace anypostearnings=. if pospearn==.\nsumm benpdbye age prepearn pospearn anypostearnings if control == 1\nsumm benpdbye age prepearn pospearn anypostearnings if hie == 1\nsumm benpdbye age prepearn pospearn anypostearnings if jsie == 1<\/code><\/pre>\n\n\n\n<p>The output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. gen anypostearnings=pospearn>0\n\n. replace anypostearnings=. if pospearn==.\n(240 real changes made, 240 to missing)\n\n. summ benpdbye age prepearn pospearn anypostearnings if control == 1\n\n    Variable |        Obs        Mean    Std. Dev.       Min        Max\n-------------+---------------------------------------------------------\n    benpdbye |      3,952    2785.891    2096.248          0       8073\n         age |      3,952     32.9795      8.8693         20         54\n    prepearn |      3,866    3640.385      2700.1          0      55000\n    pospearn |      3,866    1692.786    2036.887          0      15664\nanypostear~s |      3,866    .7956544    .4032748          0          1\n\n. summ benpdbye age prepearn pospearn anypostearnings if hie == 1\n\n    Variable |        Obs        Mean    Std. Dev.       
Min        Max\n-------------+---------------------------------------------------------\n    benpdbye |      3,963    2724.943    2094.621          0       8151\n         age |      3,963    33.09866    9.052213         20         54\n    prepearn |      3,878    3622.949    2648.758          0      34462\n    pospearn |      3,878    1731.958    2113.525          0      23621\nanypostear~s |      3,878    .7880351    .4087528          0          1\n\n. summ benpdbye age prepearn pospearn anypostearnings if jsie == 1\n\n    Variable |        Obs        Mean    Std. Dev.       Min        Max\n-------------+---------------------------------------------------------\n    benpdbye |      4,186    2591.682    2055.308          0       8151\n         age |      4,186    32.92833    8.860157         20         54\n    prepearn |      4,118    3631.068    2775.832          0      50260\n    pospearn |      4,117    1817.899    2502.684          0      66466\nanypostear~s |      4,117     .802769    .3979565          0          1<\/code><\/pre>\n\n\n\n<p>Individuals in the job-search incentive group have the highest post-claim earnings and the highest rate of any post-period earnings. 
Differences in pre-claim earnings are much smaller across the groups than differences in post-claim earnings.<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (e)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<p>The job-search incentive treatment arm provided additional incentives for people to work, and so we might expect people to search harder under this treatment and thus have higher earnings. We might have also expected the employer-benefit incentive to make workers more desirable to hire and thus increase earnings as well. At least based on the means, this experiment does not appear to have been as effective, however. The fact that pre-claim earnings are similar across groups speaks to the success of the randomization protocol.<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (f)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<p>Homework1.do<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">* Part (a)\nuse IlExp, clear\ndesc\n* Part (b)\nsumm\n* Part (d)\ngen anypostearnings=pospearn>0\nreplace anypostearnings=. 
if pospearn==.\nsumm benpdbye age prepearn pospearn anypostearnings if control == 1\nsumm benpdbye age prepearn pospearn anypostearnings if hie == 1\nsumm benpdbye age prepearn pospearn anypostearnings if jsie == 1<\/code><\/pre>\n<\/div><\/details>\n<\/div>\n\n\n\n<h2 class=\"wp-block-heading\">Homework 2<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">1 : [30 points : Theory]<\/h3>\n\n\n\n<p>Suppose we are interested in whether workers are less productive on days when there is more air pollution. We are lucky enough to have identified a sample of days $i$ where pollution $X_{i}^{*}$ is plausibly as-good-as-randomly assigned with respect to latent worker productivity, and we think the linear model<\/p>\n\n\n\n<p>\\begin{align} Y_{i} &amp; =\\mu+\\tau X_{i}^{*}+\\epsilon_{i} \\end{align} gives the causal effect $\\tau$ on average worker productivity $Y_{i}$. Unfortunately, we do not measure pollution directly. Instead, we observe a noisy measure \\begin{align} X_{i} &amp; =X_{i}^{*}+\\nu_{i} \\end{align} We assume the &#8221;measurement error&#8221; $\\nu_{i}$ is idiosyncratic, in the sense of $Cov(\\nu_{i},X_{i}^{*})=Cov(\\nu_{i},\\epsilon_{i})=0$, and that it is mean zero: $E[\\nu_{i}]=0$.<\/p>\n\n\n\n<p>(a) [<span class=\"swl-marker mark_green\">5 points<\/span>] Write down the formula for the slope coefficient from the bivariate population regression of $Y_{i}$ on $X_{i}^{*}$. Plug the model (1) into this formula, and simplify to show that this coefficient identifies $\\tau$ if and only if $Cov(X_{i}^{*},\\epsilon_{i})=0$ [this is how we&#8217;ll formalize &#8221;as-good-as-random assignment&#8221; here].<\/p>\n\n\n\n<p>(b) [<span class=\"swl-marker mark_green\">9 points<\/span>] Suppose $Cov(X_{i}^{*},\\epsilon_{i})=0$. Write down the formula for the slope coefficient from the bivariate population regression of $Y_{i}$ on $X_{i}$. 
Plug the model (1) and the measurement equation (2) into this formula and simplify to show that as-good-as-random assignment is not enough to identify $\\tau$ when the regressor is measured with error.<\/p>\n\n\n\n<p>(c) [<span class=\"swl-marker mark_green\">7 points<\/span>] How does the sign of the slope coefficient in (b) compare to $\\tau$? How do their magnitudes compare? If we were to reject the null hypothesis of an insignificant slope coefficient, could we feel confident that $\\tau\\neq0$?<\/p>\n\n\n\n<p>(d) [<span class=\"swl-marker mark_green\">9 points<\/span>] Now suppose we fix our pollution measurement device so we record $X_{i}^{*}$ in our data without error. However, we discovered a bug in our code generating the average worker productivity measure. Rather than $Y_{i}$, we are actually only able to observe a noisy outcome $\\tilde{Y}_{i}=Y_{i}+\\eta_{i}$ where we again assume idiosyncratic noise, $E[\\eta_{i}]=Cov(\\eta_{i},X_{i}^{*})=Cov(\\eta_{i},\\epsilon_{i})=0$. Write down the formula for the slope coefficient from the bivariate population regression of $\\tilde{Y}_{i}$ on $X_{i}^{*}$. Plug the model and the new measurement equation into this formula and simplify to show that the coefficient identifies $\\tau$ when $X_{i}^{*}$ is as-good-as-randomly assigned. 
Show, in other words, that measurement error &#8221;on the left&#8221; does not introduce bias (unlike measurement error &#8221;on the right,&#8221; as you showed in (b)).<\/p>\n\n\n\n<div class=\"swell-block-accordion\">\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (a)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<p>The slope coefficient is given by\\begin{align*}\\beta^{*} &amp; =\\frac{Cov(Y_{i},X_{i}^{*})}{Var(X_{i}^{*})}\\\\ &amp; =\\frac{Cov(\\mu+\\tau X_{i}^{*}+\\epsilon_{i},X_{i}^{*})}{Var(X_{i}^{*})}\\\\ &amp; =\\frac{Cov(\\mu,X_{i}^{*})+\\tau Cov(X_{i}^{*},X_{i}^{*})+Cov(\\epsilon_{i},X_{i}^{*})}{Var(X_{i}^{*})}\\\\ &amp; =\\tau+\\frac{Cov(\\epsilon_{i},X_{i}^{*})}{Var(X_{i}^{*})}\\end{align*}where we plug the model in for the second equality, use linearity for the third equality, and use the facts that $Cov(\\mu,X_{i}^{*})=0$ and $Cov(X_{i}^{*},X_{i}^{*})=Var(X_{i}^{*})$ for the fourth equality. 
This shows $\\beta^{*}=\\tau$ if and only if $Cov(\\epsilon_{i},X_{i}^{*})=0$.<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (b)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<p>The slope coefficient is given by\\begin{align*}\\beta &amp; =\\frac{Cov(Y_{i},X_{i})}{Var(X_{i})}\\\\ &amp; =\\frac{Cov(\\mu+\\tau X_{i}^{*}+\\epsilon_{i},X_{i}^{*}+\\nu_{i})}{Var(X_{i}^{*}+\\nu_{i})}\\\\ &amp; =\\frac{Cov(\\mu,X_{i}^{*})+\\tau Cov(X_{i}^{*},X_{i}^{*})+Cov(\\epsilon_{i},X_{i}^{*})+Cov(\\mu,\\nu_{i})+\\tau Cov(X_{i}^{*},\\nu_{i})+Cov(\\epsilon_{i},\\nu_{i})}{Var(X_{i}^{*})+Var(\\nu_{i})}\\\\ &amp; =\\tau\\frac{Var(X_{i}^{*})}{Var(X_{i}^{*})+Var(\\nu_{i})}\\end{align*}where we plug both the model and the measurement equation in for the second equality, use linearity for the third equality, and use the given facts to arrive at the fourth equality. 
This shows $\\beta\neq\tau$ generally; with $Var(X_{i}^{*})&gt;0$ and $Var(\\nu_{i})&gt;0$ we have $\\beta=\\tau\\kappa$ for $\\kappa\in(0,1)$.<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (c)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<p>The above formula shows that $\\beta$ and $\\tau$ have the same sign, but that the former estimand is $\\emph{attenuated}$ relative to the latter parameter. That is, $|\\beta|&lt;|\\tau|$. Thus if we can reject the null hypothesis of $\\beta=0$ we can feel confident that $\\tau\neq0$ as well, though we don&#8217;t know how much bigger $\\tau$ is (in absolute value) than $\\beta$.<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (d)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<p>We now have \\begin{align*}\\tilde{\\beta} &amp; =\\frac{Cov(\\tilde{Y}_{i},X_{i}^{*})}{Var(X_{i}^{*})}\\\\ &amp; =\\frac{Cov(\\mu+\\tau X_{i}^{*}+\\epsilon_{i}+\\eta_{i},X_{i}^{*})}{Var(X_{i}^{*})}\\\\ &amp; =\\frac{Cov(\\mu,X_{i}^{*})+\\tau Cov(X_{i}^{*},X_{i}^{*})+Cov(\\epsilon_{i},X_{i}^{*})+Cov(\\eta_{i},X_{i}^{*})}{Var(X_{i}^{*})}\\\\ &amp; 
=\\tau\\end{align*}So the causal parameter $\\tau$ is indeed identified by the regression slope $\\tilde{\\beta}$ in this case.<\/p>\n<\/div><\/details>\n<\/div>\n\n\n\n<h3 class=\"wp-block-heading\">2 : [25 points : Theory]<\/h3>\n\n\n\n<p>In class we showed that the slope coefficient $\\widehat{\\beta}$ in a bivariate OLS regression has the asymptotic distribution of:<\/p>\n\n\n\n<p>\\begin{align*}\\sqrt{N}(\\hat{\\beta}-\\beta) &amp; \\rightarrow_{d}\\mathrm{N}(0,\\sigma^{2})\\end{align*}<\/p>\n\n\n\n<p>where \\begin{align}\\sigma^{2} &amp; =\\dfrac{Var((X_{i}-E[X_{i}])\\epsilon_{i})}{Var(X_{i})^{2}}\\end{align} for $\\epsilon_{i}=Y_{i}-(\\alpha+X_{i}\\beta)$ with $\\alpha$ and $\\beta$ being the coefficients in the population bivariate regression of $Y_{i}$ on $X_{i}$. This question will teach you about homoskedasticity and heteroskedasticity. By definition, $\\epsilon_{i}$ is $\\emph{homoskedastic}$ if $Var(\\epsilon_{i}|X_{i}=x)=\\omega^{2}$ for all $x$; that is, when the conditional variance of $\\epsilon_{i}$ given $X_{i}$ doesn&#8217;t depend on $X_{i}$. Otherwise, $\\epsilon_{i}$ is said to be $\\emph{heteroskedastic}$.<\/p>\n\n\n\n<p>(a) [<span class=\"swl-marker mark_green\">6 points<\/span>] Show that if $\\epsilon_{i}$ is homoskedastic, then $Var(Y_{i}|X_{i}=x)$ doesn&#8217;t depend on $x$. [Hint: remember that $Var[a+Y]=Var[Y]$, and when we have conditional expectations\/variances we can treat functions of $X_{i}$ like constants]<\/p>\n\n\n\n<p>(b) [<span class=\"swl-marker mark_green\">6 points<\/span>] Say $Y_{i}$ is earnings and $X_{i}$ is an indicator for college attainment. In light of the fact you showed in part (a), what would homoskedasticity imply about the variance of earnings for college and non-college workers? 
Do you think this is likely to hold in practice?<\/p>\n\n\n\n<p>(c) [<span class=\"swl-marker mark_green\">9 points<\/span>] Show that if $\\epsilon_{i}$ is homoskedastic and $E[\\epsilon_{i}|X_{i}]=0$ (as occurs when the CEF is linear), then $\\sigma^{2}=\\frac{\\omega^{2}}{Var(X_{i})}$. [Hint: you may use the fact that $E[\\epsilon_{i}]=E[X_{i}\\epsilon_{i}]=0$, which we derived in class.]<\/p>\n\n\n\n<p>(d) [<span class=\"swl-marker mark_green\">4 points<\/span>] Due to some unfortunate historical circumstances, the default regression command in Stata (and R) reports standard errors based on the assumption of homoskedasticity, following the formula you derived in part (c). There is essentially no good reason to use standard errors assuming homoskedasticity. If you type &#8221;reg y x, robust&#8221;, then Stata gives you standard errors based on the formula (3); these are sometimes called heteroskedasticity-robust standard errors. You should always remember to type the &#8221;, robust&#8221; option in Stata (this can be abbreviated to &#8221;, r&#8221;)<span class=\"swl-marker mark_blue\">$^1$<\/span>. Please write the sentence, &#8221;I will not forget to use the &#8216;, r&#8217; option for robust standard errors&#8221; five times. 
[This is not a trick question &#8212; I just really want you to remember this!]<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><em><span class=\"swl-marker mark_blue\">Even very smart people like <a href=\"https:\/\/twitter.com\/instrumenthull\/status\/1274901353563934726\">Nate Silver<\/a> forget to do this sometimes.<\/span><\/em><\/li>\n<\/ol>\n\n\n\n<div class=\"swell-block-accordion\">\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (a)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<p>Recall that $\\epsilon_{i}=Y_{i}-\\alpha-X_{i}\\beta$. Hence $Var(\\epsilon_{i}\\mid X_{i})=Var(Y_{i}-\\alpha-X_{i}\\beta\\mid X_{i})=Var(Y_{i}\\mid X_{i})$. This means if $Var(\\epsilon_{i}\\mid X_{i})$ doesn&#8217;t depend on $X_{i}$, neither does $Var(Y_{i}\\mid X_{i})$.<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (b)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<p>Homoskedasticity would imply that the variance of earnings is the same for college-educated and non-college educated workers. This seems unlikely to hold in practice. 
For instance, the distribution of earnings for college-educated workers has a much longer right tail and likely has higher variance.<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (c)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<p>We showed in class that $E[\\epsilon_{i}]=E[X_{i}\\epsilon_{i}]=0$. This implies that $E[(X_{i}-E[X_{i}])\\epsilon_{i}]=E[X_{i}\\epsilon_{i}]-E[X_{i}]E[\\epsilon_{i}]=0$. Hence, $Var((X_{i}-E[X_{i}])\\epsilon_{i})=E[(X_{i}-E[X_{i}])^{2}\\epsilon_{i}^{2}]$. We then see that \\begin{align*} Var((X_{i}-E[X_{i}])\\epsilon_{i}) &amp; =E[(X_{i}-E[X_{i}])^{2}\\epsilon_{i}^{2}]\\\\ &amp; =E[E[(X_{i}-E[X_{i}])^{2}\\epsilon_{i}^{2}|X_{i}]]\\text{ (Law of iterated expectation) }\\\\ &amp; =E[(X_{i}-E[X_{i}])^{2}E[\\epsilon_{i}^{2}|X_{i}]]\\\\ &amp; =E[(X_{i}-E[X_{i}])^{2}Var[\\epsilon_{i}|X_{i}]]\\text{ (Since }Var[\\epsilon_{i}|X_{i}]=E[\\epsilon_{i}^{2}|X_{i}]-E[\\epsilon_{i}|X_{i}]^{2}=E[\\epsilon_{i}^{2}|X_{i}]\\text{) }\\\\ &amp; =E[(X_{i}-E[X_{i}])^{2}]\\omega^{2}\\text{ (Since }Var[\\epsilon_{i}|X_{i}]=\\omega^{2}\\text{ by assumption)}\\\\ &amp; =Var(X_{i})\\omega^{2}\\end{align*}<\/p>\n\n\n\n<p> Plugging into the formula for $\\sigma^{2}$, we obtain $\\omega^{2}\/Var(X_{i})$ as desired.<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (d)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" 
data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<p>I will not forget to use the `, r&#8217; option for robust standard errors. <br>I will not forget to use the `, r&#8217; option for robust standard errors.<br>I will not forget to use the `, r&#8217; option for robust standard errors.<br>I will not forget to use the `, r&#8217; option for robust standard errors.<br>I will not forget to use the `, r&#8217; option for robust standard errors.<\/p>\n<\/div><\/details>\n<\/div>\n\n\n\n<h3 class=\"wp-block-heading\">3 : [45 points : Empirics]<\/h3>\n\n\n\n<p>Let&#8217;s once again use the Woodbury and Spiegelman (1987) data, now with some regression.<\/p>\n\n\n\n<p>(a) [<span class=\"swl-marker mark_green\">7 points<\/span>] Restrict your analysis to the job-search incentive and the control group. Regress postclaim earnings on a constant and an indicator for being in the job-search incentive group (don\u2019t forget your answer to 2(d) above!). Report a screenshot of your results.<\/p>\n\n\n\n<p>(b) [<span class=\"swl-marker mark_green\">5 points<\/span>] How does the intercept estimate from your regression in part (a) compare to your estimate of the control group mean from the previous problem set? What about its confidence interval?<\/p>\n\n\n\n<p>(c) [<span class=\"swl-marker mark_green\">5 points<\/span>] How does the estimated coefficient on being in the job-search group from your regression in part (a) compare to your estimate of the treatment effect from the previous problem set (i.e. the difference in post earnings across treatment and control groups)? What about its confidence interval?<\/p>\n\n\n\n<p>(d) [<span class=\"swl-marker mark_green\">7 points<\/span>] Re-run the regression in part (a) but without using the \u2018, robust\u2019 option (never do this again!). 
Report a screenshot of your results. Discuss any changes in coefficients and standard errors.<\/p>\n\n\n\n<p>(e) [<span class=\"swl-marker mark_green\">7 points<\/span>] Re-run the regression in part (a) but with the &#8221;black&#8221; indicator included as a control. Report a screenshot of your results. Explain intuitively why it makes sense that the slope coefficient doesn&#8217;t really change with this control [hint: remember we are analyzing an experiment].<\/p>\n\n\n\n<p>(f) [<span class=\"swl-marker mark_green\">9 points<\/span>] Re-run the regression in part (e) but including an $\\emph{interaction}$ variable which multiplies the &#8221;black&#8221; indicator with the job-search incentive treatment indicator. Report a screenshot of your results. What is the regression estimate of the treatment effect for non-black individuals? What is the regression estimate of the treatment effect for black individuals? Is the difference in estimated effects statistically significant?<\/p>\n\n\n\n<p>(g) [<span class=\"swl-marker mark_green\">5 points<\/span>] Submit clean and well-commented code used for this question.<\/p>\n\n\n\n<div class=\"swell-block-accordion\">\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (a)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">use IlExp.dta, clear\ngen touse = inlist(1, control, jsie)\nreg pospearn jsie if touse, r <\/code><\/pre>\n\n\n\n<p>The output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. use IlExp.dta, clear\n\n. 
gen touse = inlist(1, control, jsie)\n\n. reg pospearn jsie if touse, r\n\nLinear regression                               Number of obs     =      7,983\n                                                F(1, 7981)        =       6.03\n                                                Prob > F          =     0.0141\n                                                R-squared         =     0.0007\n                                                Root MSE          =       2289\n\n------------------------------------------------------------------------------\n             |               Robust\n    pospearn |      Coef.   Std. Err.      t    P>|t|     &#91;95% Conf. Interval]\n-------------+----------------------------------------------------------------\n        jsie |   125.1129   50.93661     2.46   0.014     25.26381    224.9619\n       _cons |   1692.786   32.75927    51.67   0.000     1628.569    1757.003\n------------------------------------------------------------------------------<\/code><\/pre>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (b)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<p>The intercept coincides perfectly with the estimated mean of the control group. 
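This exact agreement is mechanical: with a single binary regressor, OLS fits the two group means perfectly, so the intercept equals the control-group mean and the slope equals the difference in means. A minimal numerical check of that identity (a Python sketch with simulated stand-in data and hypothetical numbers, rather than the course's Stata workflow or the actual experiment file):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated stand-ins for the treatment dummy (like jsie) and the
# outcome (like pospearn); the magnitudes are arbitrary.
d = rng.integers(0, 2, size=1000)
y = 1700.0 + 125.0 * d + rng.normal(0.0, 2300.0, size=1000)

# OLS of y on a constant and the dummy.
X = np.column_stack([np.ones(d.size), d])
intercept, slope = np.linalg.lstsq(X, y, rcond=None)[0]

# With a single dummy regressor, OLS reproduces group means exactly:
# intercept = control-group mean, slope = difference in group means.
assert np.isclose(intercept, y[d == 0].mean())
assert np.isclose(slope, y[d == 1].mean() - y[d == 0].mean())
```

The same identity is why the slope coefficient matches the difference-in-means estimate of the treatment effect exactly.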
Standard errors (and hence confidence intervals) are almost identical.<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (c)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<p>Again, the estimated coefficient coincides exactly with the treatment effect estimated in PS2. Standard errors (and hence confidence intervals) are almost identical.<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (d)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">reg pospearn jsie if touse<\/code><\/pre>\n\n\n\n<p>The output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. 
reg pospearn jsie if touse\n\n      Source |       SS           df       MS      Number of obs   =     7,983\n-------------+----------------------------------   F(1, 7981)      =      5.96\n       Model |  31209051.7         1  31209051.7   Prob > F        =    0.0147\n    Residual |  4.1816e+10     7,981  5239417.04   R-squared       =    0.0007\n-------------+----------------------------------   Adj R-squared   =    0.0006\n       Total |  4.1847e+10     7,982  5242670.57   Root MSE        =      2289\n\n------------------------------------------------------------------------------\n    pospearn |      Coef.   Std. Err.      t    P>|t|     &#91;95% Conf. Interval]\n-------------+----------------------------------------------------------------\n        jsie |   125.1129    51.2629     2.44   0.015     24.62419    225.6016\n       _cons |   1692.786   36.81379    45.98   0.000     1620.621    1764.951\n------------------------------------------------------------------------------<\/code><\/pre>\n\n\n\n<p>The coefficients are identical, as expected, but now the standard errors are different (they are no longer robust but instead calculated by the homoskedastic formula above). 
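The two variance formulas can be compared directly on simulated data. The sketch below (Python, not the course's Stata workflow; the data-generating process is an assumption for illustration) computes plug-in versions of the robust formula (3), $Var((X_{i}-E[X_{i}])\epsilon_{i})/Var(X_{i})^{2}$, and the homoskedastic formula $\omega^{2}/Var(X_{i})$ from question 2(c). When the error variance grows with $|x|$, the robust standard error comes out larger; in the homework output the comparison happens to go the other way.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Simulated data with heteroskedastic noise: error spread grows with |x|.
x = rng.normal(size=n)
eps = rng.normal(size=n) * (1.0 + np.abs(x))
y = 2.0 + 0.5 * x + eps

# Bivariate OLS slope, intercept, and residuals.
slope = ((x - x.mean()) * (y - y.mean())).mean() / x.var()
intercept = y.mean() - slope * x.mean()
resid = y - intercept - slope * x

# Plug-in asymptotic variances: robust formula vs homoskedastic formula.
robust_var = np.var((x - x.mean()) * resid) / x.var() ** 2
homosk_var = resid.var() / x.var()

se_robust = np.sqrt(robust_var / n)
se_homosk = np.sqrt(homosk_var / n)

# With variance increasing in |x|, the robust SE exceeds the classical one.
assert se_robust > se_homosk
```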
Somewhat surprisingly here, the homoskedastic standard errors are a bit larger than the heteroskedastic ones (we usually expect the opposite).<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (e)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">reg pospearn jsie black if touse, r<\/code><\/pre>\n\n\n\n<p>The output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. reg pospearn jsie black if touse, r\nLinear regression                               Number of obs     =      7,983\n                                                F(2, 7980)        =      54.95\n                                                Prob &gt; F          =     0.0000\n                                                R-squared         =     0.0103\n                                                Root MSE          =     2278.2\n\n------------------------------------------------------------------------------\n             |               Robust\n    pospearn |      Coef.   Std. Err.      t    P&gt;|t|     &#91;95% Conf. 
Interval]\n-------------+----------------------------------------------------------------\n        jsie |    115.156   50.59379     2.28   0.023     15.97893     214.333\n       black |  -511.5598   49.17525   -10.40   0.000    -607.9561   -415.1634\n       _cons |   1829.608   36.83119    49.68   0.000     1757.409    1901.807\n------------------------------------------------------------------------------\n<\/code><\/pre>\n\n\n\n<p>The randomized treatment variable should be uncorrelated with all predetermined characteristics of individuals (just as we expect it to be uncorrelated with potential outcomes). Thus none of these characteristics is a source of bias, and adding them to the simple treatment regression moves the estimated coefficient only slightly (here from about 125.1 to 115.2), a difference attributable to sampling noise.<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (f)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">gen jsie_black=jsie*black\nreg pospearn jsie jsie_black black if touse, r<\/code><\/pre>\n\n\n\n<p>The output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. gen jsie_black=jsie*black\n\n. 
reg pospearn jsie jsie_black black if touse, r\n\nLinear regression                               Number of obs     =      7,983\n                                                F(3, 7979)        =      37.18\n                                                Prob > F          =     0.0000\n                                                R-squared         =     0.0103\n                                                Root MSE          =     2278.3\n\n------------------------------------------------------------------------------\n             |               Robust\n    pospearn |      Coef.   Std. Err.      t    P>|t|     &#91;95% Conf. Interval]\n-------------+----------------------------------------------------------------\n        jsie |   123.2174   62.92546     1.96   0.050    -.1329608    246.5677\n  jsie_black |  -31.27074   98.28245    -0.32   0.750      -223.93    161.3885\n       black |  -495.8183   65.71593    -7.54   0.000    -624.6387   -366.9979\n       _cons |   1825.398    40.2122    45.39   0.000     1746.571    1904.224\n------------------------------------------------------------------------------<\/code><\/pre>\n\n\n\n<p>The regression estimate of the treatment effect for non-black individuals is given by the treatment main effect (at 123.2) since this approximates the effect of the treatment on the outcome when the black indicator is zero. The regression estimate of the treatment effect for black individuals is given by the sum of this main effect and the interaction effect (so 91.9=123.2-31.3) since this approximates the effect of the treatment on the outcome when the black indicator is one. The interaction effect thus gives the difference in estimated effects. 
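The arithmetic behind these subgroup effects can be checked numerically: a regression with a constant, a treatment dummy, a group indicator, and their interaction is saturated, so OLS reproduces the four cell means, the main effect equals the treatment-control contrast in the omitted group, and the main effect plus the interaction equals the contrast in the other group. A Python sketch with simulated stand-in data (hypothetical names and magnitudes, not the experiment file):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 8000

# Hypothetical stand-ins for the treatment dummy, the group indicator,
# and the outcome; true effects of 120 and 90 imply an interaction of -30.
treat = rng.integers(0, 2, size=n)
group = rng.integers(0, 2, size=n)
effect = np.where(group == 1, 90.0, 120.0)
y = 1800.0 - 500.0 * group + effect * treat + rng.normal(0.0, 2300.0, size=n)

# Saturated regression: constant, treatment, interaction, group indicator.
X = np.column_stack([np.ones(n), treat, treat * group, group])
_, b_treat, b_inter, _ = np.linalg.lstsq(X, y, rcond=None)[0]

def cell_mean(t, g):
    """Mean outcome in the (treatment, group) cell."""
    return y[(treat == t) & (group == g)].mean()

# Coefficient sums match the cell-mean contrasts exactly.
assert np.isclose(b_treat, cell_mean(1, 0) - cell_mean(0, 0))
assert np.isclose(b_treat + b_inter, cell_mean(1, 1) - cell_mean(0, 1))
```

The interaction coefficient is thus exactly the difference in the two subgroup effects, which is what its t-test evaluates.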
With a p-value of 0.75, it is far from statistically significant.<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (g)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<p>Homework2.do<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">* Part (a)\nuse IlExp.dta, clear\ngen touse = inlist(1, control, jsie)\nreg pospearn jsie if touse, r\n* Part (d)\nreg pospearn jsie if touse\n* Part (e)\nreg pospearn jsie black if touse, r\n* Part (f)\ngen jsie_black=jsie*black\nreg pospearn jsie jsie_black black if touse, r<\/code><\/pre>\n<\/div><\/details>\n<\/div>\n\n\n\n<h2 class=\"wp-block-heading\">Homework 3<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">1 : [32 points : Theory]<\/h3>\n\n\n\n<p>You observe an $\\emph{iid }$sample of data $(Y_{i},L_{i},K_{i})$ across a set of manufacturing firms $i$. Here $Y_{i}$ denotes the output (e.g. total sales) of the firm in some period, $L_{i}$ measures the labor input (e.g. total wage bill) of the firm in this period, and $K_{i}$ measures the capital input (e.g. total value of machines and other assets) of the firm in this period. We are interested in estimating a $\\emph{production function}$: i.e. the structural relationship $\\emph{determining}$ a firm&#8217;s ability to produce output given a set of inputs.<\/p>\n\n\n\n<p>(a) [<span class=\"swl-marker mark_green\">6 points<\/span>] Suppose you estimate a regression of $\\ln Y_{i}$ on $\\ln L_{i}$ and $\\ln K_{i}$ (and a constant), where $\\ln$ denotes the natural log. 
Explain how you would interpret the estimated coefficients on $\\ln L_{i}$ and $\\ln K_{i}$, without making any assumptions on the structural relationship.<\/p>\n\n\n\n<p>(b) [<span class=\"swl-marker mark_green\">8 points<\/span>] Now suppose you assume a Cobb-Douglas production function: $Y_{i}=Q_{i}L_{i}^{\\alpha}K_{i}^{\\beta}$ for some parameters $(\\alpha,\\beta)$, where $Q_{i}$ denotes the (unobserved) productivity of firm $i$. Suppose we assume productivity shocks are as-good-as-random across firms: i.e. that $Q_{i}$ is independent of $(L_{i},K_{i})$. Show that under this assumption the regression estimated in (a) identifies $\\alpha$ and $\\beta$.<\/p>\n\n\n\n<p>(c) [<span class=\"swl-marker mark_green\">8 points<\/span>] Suppose we further assume constant returns-to-scale: $\\alpha+\\beta=1$. Show that a bivariate regression of $\\ln(Y_{i}\/L_{i})$ on $\\ln(K_{i}\/L_{i})$ (and a constant) identifies the production function parameters, maintaining the independence assumption in (b). How could we test the constant-returns-to-scale assumption here?<\/p>\n\n\n\n<p>(d) [<span class=\"swl-marker mark_green\">10 points<\/span>] Let&#8217;s now weaken the as-good-as-random assignment assumption in (b). Suppose we model $Q_{i}=S_{i}^{\\theta}\\epsilon_{i}$ where $S_{i}$ denotes the observed size of firm $i$, $\\theta$ is a parameter governing the relationship between firm size and productivity, and $\\epsilon_{i}$ is a productivity shock that is independent of $(S_{i},L_{i},K_{i})$. Specify a regression which identifies $\\beta$ and $\\theta$ under this assumption, maintaining the assumption of $\\alpha+\\beta=1$. 
Do you expect the regression estimated in (c) to overstate or understate $\\beta$, given the new model?<\/p>\n\n\n\n<div class=\"swell-block-accordion\">\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (a)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<p>The regression \\begin{align*}\\ln Y_{i} &amp; =\\gamma_{0}+\\gamma_{1}\\ln L_{i}+\\gamma_{2}\\ln K_{i}+U_{i}\\end{align*}gives a linear approximation of the CEF &nbsp;$E[\\ln Y_{i}\\mid\\ln L_{i},\\ln K_{i}]$ absent any assumptions on the structural production function. We can interpret $\\gamma_{1}$ as the approximate partial derivative of this CEF with respect to &nbsp;$\\ln L_{i}$ and &nbsp;$\\gamma_{2}$ as the approximate partial derivative with respect to &nbsp;$\\ln K_{i}$. 
As discussed in class, these parameters have the interpretation of an elasticity: &nbsp;$\\gamma_{1}$ approximates the percentage change in output per percentage increase in labor across firms (holding capital fixed), while &nbsp;$\\gamma_{2}$ approximates the percentage change in output per percentage increase in capital across firms (holding labor fixed).<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (b)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<p>Under the Cobb-Douglas model, \\begin{align*}\\ln Y_{i} &amp; =\\ln(Q_{i}L_{i}^{\\alpha}K_{i}^{\\beta})\\\\ &amp; =\\ln Q_{i}+\\alpha\\ln L_{i}+\\beta\\ln K_{i}.\\end{align*}If &nbsp;$Q_{i}$ is independent of &nbsp;$(L_{i},K_{i})$, then &nbsp;$\\ln Q_{i}$ is independent of &nbsp;$\\ln L_{i}$ and &nbsp;$\\ln K_{i}$. In particular, the conditional expectation \\begin{align*}E[\\ln Y_{i}\\mid\\ln L_{i},\\ln K_{i}] &amp; =E[\\ln Q_{i}\\mid\\ln L_{i},\\ln K_{i}]+\\alpha\\ln L_{i}+\\beta\\ln K_{i}\\\\ &amp; =E[\\ln Q_{i}]+\\alpha\\ln L_{i}+\\beta\\ln K_{i}\\end{align*}is linear in &nbsp;$\\ln L_{i}$ and &nbsp;$\\ln K_{i}$. 
This means that the regression in (a) identifies &nbsp;$\\alpha$ and &nbsp;$\\beta$ as the coefficients of this regression under this model and assumption.<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (c)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<p>If we assume &nbsp;$\\alpha+\\beta=1$ then &nbsp;$\\alpha=1-\\beta$ and our model becomes \\begin{align*}\\ln Y_{i} &amp; =\\ln Q_{i}+(1-\\beta)\\ln L_{i}+\\beta\\ln K_{i}\\\\ &amp; =\\ln Q_{i}+\\ln L_{i}+\\beta(\\ln K_{i}-\\ln L_{i})\\end{align*}Since &nbsp;$\\ln(Y_{i}\/L_{i})=\\ln Y_{i}-\\ln L_{i}$, this means \\begin{align*}E[\\ln(Y_{i}\/L_{i})\\mid\\ln L_{i},\\ln K_{i}] &amp; =E[\\ln Y_{i}\\mid\\ln L_{i},\\ln K_{i}]-\\ln L_{i}\\\\ &amp; =E[\\ln Q_{i}]+\\beta(\\ln K_{i}-\\ln L_{i}).\\end{align*}So, as before, the conditional expectation &nbsp;$E[\\ln(Y_{i}\/L_{i})\\mid\\ln L_{i},\\ln K_{i}]$ is linear in &nbsp;$\\ln K_{i}-\\ln L_{i}=\\ln(K_{i}\/L_{i})$. This means the slope coefficient in a bivariate regression of &nbsp;$\\ln(Y_{i}\/L_{i})$ on &nbsp;$\\ln(K_{i}\/L_{i})$ identifies &nbsp;$\\beta$, and since we know &nbsp;$\\alpha=1-\\beta$ this parameter is also identified. 
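This identification claim is easy to verify numerically. Below is an illustrative Python simulation (not part of the original solution; all parameter values and variable names are invented for the example) in which the bivariate regression recovers the true capital share:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
beta = 0.3  # true capital share; alpha = 1 - beta under CRS

# Inputs can be correlated with each other; productivity Q is drawn
# independently of (L, K), matching the assumption in part (b).
lnL = rng.normal(0.0, 1.0, n)
lnK = 0.5 * lnL + rng.normal(0.0, 1.0, n)
lnQ = rng.normal(0.0, 0.2, n)

# Cobb-Douglas with constant returns to scale: Y = Q * L^(1-beta) * K^beta
lnY = lnQ + (1 - beta) * lnL + beta * lnK

# Bivariate regression of ln(Y/L) on ln(K/L)
x = lnK - lnL
y = lnY - lnL
beta_hat = np.polyfit(x, y, 1)[0]
print(round(beta_hat, 2))  # ≈ 0.3
```

Note that the slope is consistent for beta even though ln(K/L) is correlated with ln L, because the residual in the transformed equation is just ln Q, which is independent of the regressor.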
To test constant returns-to-scale, we could regress &nbsp;$\\ln Y_{i}$ on &nbsp;$\\ln L_{i}$ and &nbsp;$\\ln K_{i}$ and use the $\\texttt{lincom}$ command in Stata to check whether the sum of their coefficients is one.<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (d)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<p>The model is now &nbsp;$Y_{i}=L_{i}^{1-\\beta}K_{i}^{\\beta}S_{i}^{\\theta}\\epsilon_{i}$, implying \\begin{align*}\\ln Y_{i} &amp; =\\ln(L_{i}^{1-\\beta}K_{i}^{\\beta}S_{i}^{\\theta}\\epsilon_{i})\\\\ &amp; =(1-\\beta)\\ln L_{i}+\\beta\\ln K_{i}+\\theta\\ln S_{i}+\\ln\\epsilon_{i}\\\\\\ln Y_{i}-\\ln L_{i} &amp; =\\beta(\\ln K_{i}-\\ln L_{i})+\\theta\\ln S_{i}+\\ln\\epsilon_{i}\\\\\\ln(Y_{i}\/L_{i}) &amp; =\\beta\\ln(K_{i}\/L_{i})+\\theta\\ln S_{i}+\\ln\\epsilon_{i}.\\end{align*}Similar to before, we have \\begin{align*}E[\\ln(Y_{i}\/L_{i})\\mid\\ln L_{i},\\ln K_{i},\\ln S_{i}] &amp; =\\beta\\left(\\ln(K_{i}\/L_{i})\\right)+\\theta\\ln S_{i}+E[\\ln\\epsilon_{i}\\mid\\ln L_{i},\\ln K_{i},\\ln S_{i}]\\\\ &amp; =E[\\ln\\epsilon_{i}]+\\beta\\ln(K_{i}\/L_{i})+\\theta\\ln S_{i}\\end{align*}using the independence of &nbsp;$\\epsilon_{i}$ from &nbsp;$(S_{i},L_{i},K_{i})$, which implies the independence of &nbsp;$\\ln\\epsilon_{i}$ from &nbsp;$(\\ln S_{i},\\ln L_{i},\\ln K_{i})$. This means that a regression of log output\/labor on log capital\/labor and log firm size identifies the production function parameters &nbsp;$(\\beta,\\theta)$. 
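A small numeric simulation (illustrative Python; the parameter values and the positive size/capital-intensity link are assumptions made for the example, not part of the solution) shows the long regression recovering both parameters, while the short regression from (c) drifts upward when Cov(ln S, ln(K/L)) > 0:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
beta, theta = 0.3, 0.2  # true parameters (made-up values)

lnS = rng.normal(0.0, 1.0, n)              # log firm size
lnL = 0.4 * lnS + rng.normal(0.0, 1.0, n)  # inputs may depend on size
lnK = 0.6 * lnS + rng.normal(0.0, 1.0, n)  # => Cov(lnS, lnK - lnL) > 0
lnEps = rng.normal(0.0, 0.2, n)            # shock independent of (S, L, K)

lnY = (1 - beta) * lnL + beta * lnK + theta * lnS + lnEps

# Long regression: ln(Y/L) on ln(K/L) and ln(S) identifies (beta, theta).
x = lnK - lnL
y = lnY - lnL
X = np.column_stack([np.ones(n), x, lnS])
_, beta_hat, theta_hat = np.linalg.lstsq(X, y, rcond=None)[0]

# Short regression omitting ln(S): slope = beta + theta*Cov(lnS, x)/Var(x),
# which exceeds beta in this design.
beta_short = np.polyfit(x, y, 1)[0]
print(beta_short > beta_hat)  # True: the short regression overstates beta
```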
The regression model which omits log firm size will generally be \u2018\u2018biased&#8221; (in the sense of an identification failure, not the statistical sense). Specifically, it will identify \\begin{align*}\\frac{Cov\\left(\\ln(Y_{i}\/L_{i}),\\ln(K_{i}\/L_{i})\\right)}{Var\\left(\\ln(K_{i}\/L_{i})\\right)} &amp; =\\frac{Cov\\left(\\beta\\ln(K_{i}\/L_{i})+\\theta\\ln S_{i}+\\ln\\epsilon_{i},\\ln(K_{i}\/L_{i})\\right)}{Var\\left(\\ln(K_{i}\/L_{i})\\right)}\\\\ &amp; =\\beta+\\theta\\frac{Cov\\left(\\ln S_{i},\\ln(K_{i}\/L_{i})\\right)}{Var\\left(\\ln(K_{i}\/L_{i})\\right)}\\end{align*}I would expect &nbsp;$\\theta&gt;0$, i.e. that larger firms are more productive holding capital and labor fixed. I have less of a solid sense of the sign of &nbsp;$Cov\\left(\\ln S_{i},\\ln(K_{i}\/L_{i})\\right)$, but one might imagine that more capital-intensive firms are larger because they have more ability to pay the fixed costs to invest in things like fancy machinery or buildings. In this case &nbsp;$Cov\\left(\\ln S_{i},\\ln(K_{i}\/L_{i})\\right)&gt;0$ and so the regression in (c) will generally overstate &nbsp;$\\beta$. If you told a story for why &nbsp;$Cov\\left(\\ln S_{i},\\ln(K_{i}\/L_{i})\\right)&lt;0$ then you might conclude that there is a downward bias in (c).<\/p>\n<\/div><\/details>\n<\/div>\n\n\n\n<h3 class=\"wp-block-heading\">2 : [32 points : Theory]<\/h3>\n\n\n\n<p>Suppose we are interested in estimating the (potentially different) employment effects of minimum wage increases for high school dropouts and high school graduates. As in Card and Krueger (1994), we observe employment outcomes for a sample of individuals of both educational groups in New Jersey and Pennsylvania, before and after the New Jersey minimum wage increase. 
Let $Y_{it}$ denote the employment status of individual $i$ at time $t$, let $D_{i}\\in\\{0,1\\}$ indicate an individual&#8217;s residence in New Jersey (assuming nobody moves between the two time periods), and let $Post_{t}\\in\\{0,1\\}$ indicate the latter time period. Furthermore, let $Grad_{i}\\in\\{0,1\\}$ indicate high school graduation. Consider the regression \\begin{align}Y_{it}= &amp; \\mu+\\alpha D_{i}+\\tau Post_{t}+\\gamma Grad_{i}+\\beta D_{i}Post_{t}\\\\ &amp; +\\lambda Post_{t}Grad_{i}+\\psi D_{i}Grad_{i}+\\pi D_{i}Post_{t}Grad_{i}+\\upsilon_{it}.\\nonumber \\end{align}<\/p>\n\n\n\n<p>Note that this regression includes all \u2018\u2018main effects&#8221; ($D_{i}$, $Post_{t}$, and $Grad_{i}$), all two-way interactions ($D_{i}Post_{t}$, $Post_{t}Grad_{i}$, and $D_{i}Grad_{i}$), as well as the three-way interaction $D_{i}Post_{t}Grad_{i}$.<\/p>\n\n\n\n<p>(a) [<span class=\"swl-marker mark_green\">7 Points<\/span>] Suppose we regress $Y_{it}$ on $D_{i}$, $Post_{t}$, and $D_{i}Post_{t}$ in the sub-sample of high school dropouts (with $Grad_{i}=0$). Derive the coefficients for this sub-sample regression in terms of the coefficients in the full-sample regression (1). Repeat this exercise for the saturated regression of $Y_{it}$ on $D_{i}$, $Post_{t}$, and $D_{i}Post_{t}$ in the sub-sample of high school graduates (with $Grad_{i}=1$): what do the coefficients for this sub-sample regression equal, in terms of the coefficients in (1)?<\/p>\n\n\n\n<p>(b) [<span class=\"swl-marker mark_green\">8 Points<\/span>] Extending what we saw in lecture, state assumptions under which these two sub-sample regressions (in the $Grad_{i}=0$ and $Grad_{i}=1$ subsamples) identify the causal effects of minimum wage increases on employment for high school dropouts and graduates, respectively. 
Prove your claims.<\/p>\n\n\n\n<p>(c) [<span class=\"swl-marker mark_green\">7 Points<\/span>] Under the assumptions in (b), which coefficient in (1) yields a test for whether the minimum wage effects for high school dropouts and graduates differ? Use your answers in (a).<\/p>\n\n\n\n<p>(d) [<span class=\"swl-marker mark_green\">10 Points<\/span>] Suppose New Jersey and Pennsylvania were on different employment trends when the minimum wage was increased, such that your assumptions in (b) fail. However, suppose the $\\emph{difference}$ in employment trends across states is the $\\emph{same}$ for high school dropouts and graduates. Show that under this weaker assumption the coefficient from (c) still identifies the difference in minimum wage effects across the groups.<\/p>\n\n\n\n<div class=\"swell-block-accordion\">\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (a)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<p>In the &nbsp;$Grad_{i}=0$ sub-sample, we obtain \\begin{align*}Y_{it} &amp; =\\mu+\\alpha D_{i}+\\tau Post_{t}+\\beta D_{i}Post_{t}+u_{it},\\end{align*}since the coefficients from these terms in (1) fit the elements of &nbsp;$E[Y_{it}\\mid D_{i},Post_{t},Grad_{i}=0]$. 
In the &nbsp;$Grad_{i}=1$ sub-sample, we obtain \\begin{align*}Y_{it} &amp; =(\\gamma+\\mu)+(\\alpha+\\psi)D_{i}+(\\tau+\\lambda)Post_{t}+(\\beta+\\pi)D_{i}Post_{t}+v_{it},\\end{align*}by the same logic.<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (b)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<p>Suppose, for each &nbsp;$g\\in\\{0,1\\}$, \\begin{align*}E[Y_{i2}(0)-Y_{i1}(0)\\mid D_{i}=1,Grad_{i}=g] &amp; =E[Y_{i2}(0)-Y_{i1}(0)\\mid D_{i}=0,Grad_{i}=g],\\end{align*}where we use the potential outcomes notation from class. Under these parallel trends assumptions we have \\begin{align*}\\beta &amp; =E[Y_{i2}-Y_{i1}\\mid D_{i}=1,Grad_{i}=0]-E[Y_{i2}-Y_{i1}\\mid D_{i}=0,Grad_{i}=0]\\\\ &amp; =E[Y_{i2}(1)-Y_{i1}(0)\\mid D_{i}=1,Grad_{i}=0]-E[Y_{i2}(0)-Y_{i1}(0)\\mid D_{i}=0,Grad_{i}=0]\\\\ &amp; =E[Y_{i2}(1)-Y_{i1}(0)\\mid D_{i}=1,Grad_{i}=0]-E[Y_{i2}(0)-Y_{i1}(0)\\mid D_{i}=1,Grad_{i}=0]\\\\ &amp; =E[Y_{i2}(1)-Y_{i2}(0)\\mid D_{i}=1,Grad_{i}=0],\\end{align*}following the proof in the lecture slides. 
Similarly, \\begin{align*}\\beta+\\pi &amp; =E[Y_{i2}-Y_{i1}\\mid D_{i}=1,Grad_{i}=1]-E[Y_{i2}-Y_{i1}\\mid D_{i}=0,Grad_{i}=1]\\\\ &amp; =E[Y_{i2}(1)-Y_{i2}(0)\\mid D_{i}=1,Grad_{i}=1].\\end{align*}<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (c)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<p>The difference we wish to test is \\begin{align*} &amp; E[Y_{i2}(1)-Y_{i2}(0)\\mid D_{i}=1,Grad_{i}=1]-E[Y_{i2}(1)-Y_{i2}(0)\\mid D_{i}=1,Grad_{i}=0]\\\\ &amp; =(\\beta+\\pi)-\\beta\\\\ &amp; =\\pi\\end{align*}So we could test whether the coefficient on &nbsp;$D_{i}Grad_{i}Post_{t}$ in (1) is zero.<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (d)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<p>The \u201cdifference-in-difference-in-differences\u201d (sometimes called \u201ctriple-diff\u201d) regression coefficient gives \\begin{align*}\\pi\\text{=} &amp; E[Y_{i2}-Y_{i1}\\mid D_{i}=1,Grad_{i}=1]-E[Y_{i2}-Y_{i1}\\mid D_{i}=0,Grad_{i}=1]\\\\ &amp; -\\left(E[Y_{i2}-Y_{i1}\\mid D_{i}=1,Grad_{i}=0]-E[Y_{i2}-Y_{i1}\\mid D_{i}=0,Grad_{i}=0]\\right)\\\\= &amp; E[Y_{i2}(1)-Y_{i1}(0)\\mid 
D_{i}=1,Grad_{i}=1]-E[Y_{i2}(0)-Y_{i1}(0)\\mid D_{i}=0,Grad_{i}=1]\\\\ &amp; -\\left(E[Y_{i2}(1)-Y_{i1}(0)\\mid D_{i}=1,Grad_{i}=0]-E[Y_{i2}(0)-Y_{i1}(0)\\mid D_{i}=0,Grad_{i}=0]\\right)\\\\= &amp; \\underbrace{E[Y_{i2}(1)-Y_{i2}(0)\\mid D_{i}=1,Grad_{i}=1]-E[Y_{i2}(1)-Y_{i2}(0)\\mid D_{i}=1,Grad_{i}=0]}_{\\text{Parameter of interest}}\\\\ &amp; +\\underbrace{E[Y_{i2}(0)-Y_{i1}(0)\\mid D_{i}=1,Grad_{i}=1]-E[Y_{i2}(0)-Y_{i1}(0)\\mid D_{i}=0,Grad_{i}=1]}_{\\text{Difference in trends for }{Grad_{i}=1}}\\\\ &amp; -\\left(\\underbrace{E[Y_{i2}(0)-Y_{i1}(0)\\mid D_{i}=1,Grad_{i}=0]-E[Y_{i2}(0)-Y_{i1}(0)\\mid D_{i}=0,Grad_{i}=0]}_{\\text{Difference in trends for }Grad_{i}=0}\\right),\\end{align*}where the first equality uses the potential outcomes model and the second equality uses linearity of expectations and rearranges terms. The weaker assumption is that the two differences in trends are equal to each other (though not necessarily each zero). When this holds they cancel, and we are left with the parameter of interest.<\/p>\n<\/div><\/details>\n<\/div>\n\n\n\n<h3 class=\"wp-block-heading\">3 : [36 points : Empirics]<\/h3>\n\n\n\n<p>In this problem, you will look at how Medicaid expansions impact insurance coverage using publicly-available data that is similar to the (confidential) data used in <a href=\"https:\/\/www.dropbox.com\/s\/mgunjcebpgnb939\/Carey-et-al.pdf?dl=0\">Carey et al. (2020)<\/a>, which we discussed in class. The attached dataset $\\emph{ehec\\_data.dta}$ contains state-level panel data that shows the fraction of low-income childless adults who have health insurance in each year. Start by loading this data into Stata.<\/p>\n\n\n\n<p>(a) [<span class=\"swl-marker mark_green\">4 points<\/span>] Let&#8217;s first get a feel for the data. When you open a dataset, it&#8217;s good to use the $\\texttt{browse}$ command, which shows you the raw data. 
This helps you see how the data is structured.<\/p>\n\n\n\n<p>Run the command and report a screenshot of your results. Next, use the $\\texttt{tab}$ command to tabulate the year variable. Report a screenshot of your results. For what years is data available?<\/p>\n\n\n\n<p>(b) [<span class=\"swl-marker mark_green\">4 points<\/span>] The variable $\\texttt{yexp2}$ shows the first year that a state expanded Medicaid under the Affordable Care Act, and is missing if a state never expanded Medicaid. Use the $\\texttt{tab}$ command to figure out how many states in the data first expanded in each year, and report a screenshot of your result. How many states (in the data) first expanded in 2014? How many never expanded? Are all 50 states contained in the data? [Hint: you can use the \u2018\u2018, missing&#8221; option to tabulate missing values. Since you have panel data, each state will appear multiple times in the data, so you will want to tabulate only for a fixed year (e.g. add the \u2018\u2018if year == 2009&#8221; option) so that each state only shows up once in your tabulations.]<\/p>\n\n\n\n<p>(c) [<span class=\"swl-marker mark_green\">5 points<\/span>] As in Carey et al, we will focus on the first two years of Medicaid expansion, 2014 and 2015. To simplify matters, drop the 3 states that first expanded in 2015 for the remainder of the analysis (since these states are partially treated during the time we&#8217;re studying). Create a variable $\\texttt{treatment}$ that is equal to 1 if a state expanded in 2014 and equal to 0 if a state never expanded or expanded after 2015. Tabulate your treatment variable (for a fixed year, as above) and make sure the number of treated and control states matches what you&#8217;d expect from your previous answers. 
Report a screenshot of your tabulate command.<\/p>\n\n\n\n<p>(d) [<span class=\"swl-marker mark_green\">6 points<\/span>] Using observations from 2013 and 2014 $\\textit{only}$, estimate the regression specification<\/p>\n\n\n\n<p>\\[Y_{it}=\\beta_{0}+1[t=2014]\\times\\beta_{1}+treatment_{i}\\times\\beta_{2}+treatment_{i}\\times1[t=2014]\\times\\beta_{3}+\\epsilon_{it}\\]<\/p>\n\n\n\n<p>where $Y_{it}$ denotes the insurance coverage rate of state $i$ in year $t$. Cluster your standard errors by state using the \u2018\u2018, cluster(stfips)&#8221; option (instead of the usual \u2018\u2018, r&#8221;). What is your difference-in-differences estimate of the effect of Medicaid expansion on coverage? Is it significant?<\/p>\n\n\n\n<p>(e) [<span class=\"swl-marker mark_green\">7 points<\/span>] One way to assess the plausibility of the key parallel trends assumption in difference-in-differences settings is to create an \u2018\u2018event-study plot&#8221; that allows us to assess pre-treatment differences in trends. That is, we compare the trends for the two groups both before and after the treatment occurred. To do this, create the variable $\\texttt{t2008}=\\texttt{treatment}\\times1[t=2008]$. Create analogous variables $\\texttt{t2009},&#8230;,\\texttt{t2019}$. Set $\\texttt{t2013}$ to 0 for all observations. [Note: this normalizes the coefficient on $\\texttt{t2013}$ to 0. This is the same as omitting this variable from the regression, except that including the zero variable in the regression in Stata makes it easier to plot the coefficients.] Regress $\\texttt{dins}$ on fixed effects for year, fixed effects for state, and the variables $\\texttt{t2008},&#8230;,\\texttt{t2019}$ you just created. 
That is, use OLS to estimate the regression<\/p>\n\n\n\n<p>\\[Y_{it}=\\phi_{i}+\\lambda_{t}+\\sum_{s\\neq2013}1[t=s]\\times treatment_{i}\\times\\beta_{s}+\\epsilon_{it}\\]<\/p>\n\n\n\n<p>[Note: you can specify fixed effects in a regression specification by writing \u2018\u2018i.stfips&#8221; for state fixed effects and \u2018\u2018i.year&#8221; for year fixed effects.] Again, remember to cluster your standard errors at the state level. Install the $\\texttt{coefplot}$ package by running \u2018\u2018ssc install coefplot&#8221;. Then, run the command \u2018\u2018coefplot, omitted keep(t2{*}) vertical&#8221; to create an event-study plot. Report a screenshot of both your regression results and the plot.<\/p>\n\n\n\n<p>(f) [<span class=\"swl-marker mark_green\">5 points<\/span>] Use the $\\texttt{test}$ command to test the joint null hypothesis that all of the pre-treatment event-study coefficients, $\\beta_{2008},&#8230;,\\beta_{2012}$, are equal to zero. [Hint: the command \u2018\u2018test x1 x2&#8221; runs an F-test for the joint hypothesis that the coefficients on x1 and x2 are both zero.] What is the $p$-value from this joint $F$-test? 
Does this increase your confidence in the parallel trends assumption?<\/p>\n\n\n\n<p>(g) [<span class=\"swl-marker mark_green\">5 points<\/span>] Submit clean and well-commented code used for this question.<\/p>\n\n\n\n<div class=\"swell-block-accordion\">\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (a)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">use ehec_data.dta, clear\nbrowse\ntab year<\/code><\/pre>\n\n\n\n<p>The output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. use ehec_data.dta, clear\n\n. br\n\n. tab year\n\n Census\/ACS |\nsurvey year |      Freq.     
Percent        Cum.\n------------+-----------------------------------\n       2008 |         46        8.33        8.33\n       2009 |         46        8.33       16.67\n       2010 |         46        8.33       25.00\n       2011 |         46        8.33       33.33\n       2012 |         46        8.33       41.67\n       2013 |         46        8.33       50.00\n       2014 |         46        8.33       58.33\n       2015 |         46        8.33       66.67\n       2016 |         46        8.33       75.00\n       2017 |         46        8.33       83.33\n       2018 |         46        8.33       91.67\n       2019 |         46        8.33      100.00\n------------+-----------------------------------\n      Total |        552      100.00<\/code><\/pre>\n\n\n\n<p>Data is available for all years from 2008 to 2019.<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (b)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">tab yexp2 if year == 2009, m<\/code><\/pre>\n\n\n\n<p>The output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. tab yexp2 if year == 2009, m\n\n    Year of |\n   Medicaid |\n  Expansion |      Freq.     Percent        Cum.\n------------+-----------------------------------\n       2014 |         22       47.83       47.83\n       2015 |          3        6.52       54.35\n       2016 |          2        4.35       58.70\n       2017 |          1        2.17       60.87\n       2019 |          2        4.35       65.22\n          . 
|         16       34.78      100.00\n------------+-----------------------------------\n      Total |         46      100.00<\/code><\/pre>\n\n\n\n<p>We only have data for 46 states. Of these, 22 expanded in 2014, 8 expanded at some point after 2014, and 16 never expanded.<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (c)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">gen treatment = .\nreplace treatment = 1 if yexp2 == 2014\nreplace treatment = 0 if yexp2 >= 2016\ndrop if treatment == .\ntab treatment if year==2008, m<\/code><\/pre>\n\n\n\n<p>The output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. gen treatment = .\n(552 missing values generated)\n\n. replace treatment = 1 if yexp2 == 2014\n(264 real changes made)\n\n. replace treatment = 0 if yexp2 >= 2016\n(252 real changes made)\n\n. drop if treatment == .\n(36 observations deleted)\n\n. tab treatment if year==2008, m\n\n  treatment |      Freq.     Percent        Cum.\n------------+-----------------------------------\n          0 |         21       48.84       48.84\n          1 |         22       51.16      100.00\n------------+-----------------------------------\n      Total |         43      100.00<\/code><\/pre>\n\n\n\n<p>22 states expanded Medicaid in 2014, while 21 expanded it in 2016 or later, or never. 
This matches what we would expect from the previous table.<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (d)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">gen y2014 = (year == 2014)\ngen t_y2014 = y2014 * treatment\nreg dins treatment y2014 t_y2014 if year == 2013 | year == 2014, cluster(stfips)<\/code><\/pre>\n\n\n\n<p>The output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. gen y2014 = (year == 2014)\n\n. gen t_y2014 = y2014 * treatment\n\n. reg dins treatment y2014 t_y2014 if year == 2013 | year == 2014, cluster(stfips)\n\nLinear regression                               Number of obs     =         86\n                                                F(3, 42)          =      96.65\n                                                Prob > F          =     0.0000\n                                                R-squared         =     0.4586\n                                                Root MSE          =     .05336\n\n                                (Std. Err. adjusted for 43 clusters in stfips)\n------------------------------------------------------------------------------\n             |               Robust\n        dins |      Coef.   Std. Err.      t    P>|t|     &#91;95% Conf. 
Interval]\n-------------+----------------------------------------------------------------\n   treatment |   .0396753   .0159493     2.49   0.017     .0074883    .0718622\n       y2014 |   .0448456   .0060665     7.39   0.000     .0326029    .0570883\n     t_y2014 |   .0464469   .0091256     5.09   0.000     .0280306    .0648631\n       _cons |   .6227468    .009852    63.21   0.000     .6028648    .6426289\n------------------------------------------------------------------------------<\/code><\/pre>\n\n\n\n<p>I estimate a treatment effect of &nbsp;$\\hat{\\beta}_{3}\\approx0.046$ with a clustered standard error of &nbsp;$0.009$; so it&#8217;s highly statistically significant.<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (e)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">forvalues yr = 2008\/2019{\n\tgen t`yr' = treatment * (year == `yr')\n}\ncap ssc install coefplot\nreplace t2013 = 0\nreg dins t2008-t2012 t2013 t2014-t2019 i.year i.stfips, cluster(stfips)\ncoefplot, omitted keep(t2*) vertical \ngraph export DD_1.png, replace<\/code><\/pre>\n\n\n\n<p>The output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. forvalues yr = 2008\/2019{\n  2.         gen t`yr' = treatment * (year == `yr')\n  3. }\n\n. cap ssc install coefplot\n \n. replace t2013 = 0\n(22 real changes made)\n\n. 
reg dins t2008-t2012 t2013 t2014-t2019 i.year i.stfips, cluster(stfips)\nnote: t2013 omitted because of collinearity\n\nLinear regression                               Number of obs     =        516\n                                                F(21, 42)         =          .\n                                                Prob > F          =          .\n                                                R-squared         =     0.9374\n                                                Root MSE          =      .0242\n\n                                   (Std. Err. adjusted for 43 clusters in stfips)\n---------------------------------------------------------------------------------\n                |               Robust\n           dins |      Coef.   Std. Err.      t    P>|t|     &#91;95% Conf. Interval]\n----------------+----------------------------------------------------------------\n          t2008 |  -.0052854   .0090566    -0.58   0.563    -.0235622    .0129915\n          t2009 |  -.0112973   .0089213    -1.27   0.212    -.0293013    .0067066\n          t2010 |   -.002676   .0074388    -0.36   0.721     -.017688     .012336\n          t2011 |  -.0014193   .0066217    -0.21   0.831    -.0147825    .0119439\n          t2012 |   .0003397   .0077351     0.04   0.965    -.0152705    .0159498\n          t2013 |          0  (omitted)\n          t2014 |   .0464469    .009578     4.85   0.000     .0271176    .0657761\n          t2015 |   .0692062    .010832     6.39   0.000     .0473463     .091066\n          t2016 |   .0747343   .0117466     6.36   0.000     .0510288    .0984399\n          t2017 |   .0642144    .012695     5.06   0.000     .0385948    .0898339\n          t2018 |   .0618816   .0146892     4.21   0.000     .0322376    .0915256\n          t2019 |   .0646171   .0130541     4.95   0.000     .0382728    .0909614\n                |\n           year |\n          2009  |  -.0110171   .0041383    -2.66   0.011    -.0193686   -.0026657\n          2010  |  
-.0200235   .0049124    -4.08   0.000    -.0299371   -.0101098\n          2011  |  -.0184424   .0054814    -3.36   0.002    -.0295044   -.0073804\n          2012  |  -.0126684   .0043538    -2.91   0.006    -.0214547   -.0038822\n          2013  |   -.006946   .0064585    -1.08   0.288    -.0199798    .0060877\n          2014  |   .0378995   .0042739     8.87   0.000     .0292745    .0465246\n          2015  |   .0694425   .0081728     8.50   0.000     .0529492    .0859358\n          2016  |   .0848653   .0089196     9.51   0.000     .0668648    .1028657\n          2017  |   .0872879   .0101555     8.60   0.000     .0667932    .1077827\n          2018  |   .0892268   .0118061     7.56   0.000     .0654011    .1130525\n          2019  |   .0842069   .0117343     7.18   0.000     .0605261    .1078876\n                |\n         stfips |\n        alaska  |   -.103853   1.04e-15 -1.0e+14   0.000     -.103853    -.103853\n       arizona  |  -.0412094   .0067381    -6.12   0.000    -.0548075   -.0276113\n      arkansas  |  -.0117976   .0067381    -1.75   0.087    -.0253957    .0018005\n    california  |  -.0416807   .0067381    -6.19   0.000    -.0552788   -.0280825\n      colorado  |  -.0107549   .0067381    -1.60   0.118     -.024353    .0028433\n   connecticut  |   .0482399   .0067381     7.16   0.000     .0346418     .061838\n       florida  |  -.0857497   1.04e-15 -8.3e+13   0.000    -.0857497   -.0857497\n       georgia  |   -.090137   1.04e-15 -8.7e+13   0.000     -.090137    -.090137\n        hawaii  |   .1102658   .0067381    16.36   0.000     .0966677    .1238639\n         idaho  |  -.0128005   1.04e-15 -1.2e+13   0.000    -.0128005   -.0128005\n      illinois  |  -.0163106   .0067381    -2.42   0.020    -.0299087   -.0027125\n          iowa  |   .0876154   .0067381    13.00   0.000     .0740173    .1012135\n        kansas  |   .0138945   1.04e-15  1.3e+13   0.000     .0138945    .0138945\n      kentucky  |   .0309765   .0067381     4.60   0.000     .0173784   
 .0445747\n     louisiana  |  -.0358099   1.04e-15 -3.5e+13   0.000    -.0358099   -.0358099\n         maine  |   .0656128   1.04e-15  6.3e+13   0.000     .0656128    .0656128\n      maryland  |   .0118266   .0067381     1.76   0.087    -.0017715    .0254247\n      michigan  |   .0349109   .0067381     5.18   0.000     .0213128     .048509\n     minnesota  |   .0884664   .0067381    13.13   0.000     .0748682    .1020645\n   mississippi  |  -.0424017   1.04e-15 -4.1e+13   0.000    -.0424017   -.0424017\n      missouri  |   .0185215   1.04e-15  1.8e+13   0.000     .0185215    .0185215\n       montana  |   .0016449   1.04e-15  1.6e+12   0.000     .0016449    .0016449\n      nebraska  |   .0465129   1.04e-15  4.5e+13   0.000     .0465129    .0465129\n        nevada  |  -.0688877   .0067381   -10.22   0.000    -.0824858   -.0552896\n    new jersey  |  -.0539224   .0067381    -8.00   0.000    -.0675205   -.0403243\n    new mexico  |   -.035146   .0067381    -5.22   0.000    -.0487441   -.0215479\nnorth carolina  |  -.0214531   1.04e-15 -2.1e+13   0.000    -.0214531   -.0214531\n  north dakota  |   .0414656   .0067381     6.15   0.000     .0278675    .0550637\n          ohio  |   .0163148   .0067381     2.42   0.020     .0027167    .0299129\n      oklahoma  |  -.0662598   1.04e-15 -6.4e+13   0.000    -.0662598   -.0662598\n        oregon  |  -.0007891   .0067381    -0.12   0.907    -.0143872     .012809\n  rhode island  |   .0601783   .0067381     8.93   0.000     .0465801    .0737764\nsouth carolina  |  -.0346476   1.04e-15 -3.3e+13   0.000    -.0346476   -.0346476\n  south dakota  |   .0173781   1.04e-15  1.7e+13   0.000     .0173781    .0173781\n     tennessee  |  -.0172016   1.04e-15 -1.7e+13   0.000    -.0172016   -.0172016\n         texas  |  -.1207823   1.04e-15 -1.2e+14   0.000    -.1207823   -.1207823\n          utah  |  -.0098695   1.04e-15 -9.5e+12   0.000    -.0098695   -.0098695\n      virginia  |   .0046849   1.04e-15  4.5e+12   0.000     .0046849    
.0046849\n    washington  |   .0179123   .0067381     2.66   0.011     .0043142    .0315104\n west virginia  |   .0310248   .0067381     4.60   0.000     .0174267     .044623\n     wisconsin  |   .0494254   .0067381     7.34   0.000     .0358273    .0630235\n       wyoming  |  -.0281642   1.04e-15 -2.7e+13   0.000    -.0281642   -.0281642\n                |\n          _cons |   .6535443   .0051142   127.79   0.000     .6432234    .6638652\n---------------------------------------------------------------------------------\n\n. coefplot, omitted keep(t2*) vertical \n\n. graph export DD_1.png, replace\n(note: file DD_1.png not found)\n(file DD_1.png written in PNG format)<\/code><\/pre>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (f)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">test t2008 t2009 t2010 t2011 t2012<\/code><\/pre>\n\n\n\n<p>The output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. test t2008 t2009 t2010 t2011 t2012\n\n ( 1)  t2008 = 0\n ( 2)  t2009 = 0\n ( 3)  t2010 = 0\n ( 4)  t2011 = 0\n ( 5)  t2012 = 0\n\n       F(  5,    42) =    0.76\n            Prob > F =    0.5856<\/code><\/pre>\n\n\n\n<p>I get a p-value of &nbsp;$.58$, which means we can&#8217;t reject the null hypothesis that treated and control states had parallel trends in 2008-2013. 
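<\/p>

<p>As an illustrative cross-check (in Python rather than Stata, purely optional), the reported p-value can be recovered from the upper tail of the $F(5, 42)$ distribution evaluated at the test statistic $0.76$; both numbers are taken from the output above. Small discrepancies are expected because Stata rounds the displayed F statistic.<\/p>

```python
import math

def f_pdf(x, d1, d2):
    # Density of the F distribution with (d1, d2) degrees of freedom,
    # computed on the log scale for numerical stability.
    log_beta = (math.lgamma(d1 / 2) + math.lgamma(d2 / 2)
                - math.lgamma((d1 + d2) / 2))
    log_pdf = ((d1 / 2) * math.log(d1 / d2)
               + (d1 / 2 - 1) * math.log(x)
               - ((d1 + d2) / 2) * math.log(1 + d1 * x / d2)
               - log_beta)
    return math.exp(log_pdf)

def f_tail(f_stat, d1, d2, upper=1000.0, n=20000):
    # P(F > f_stat) via composite Simpson's rule on [f_stat, upper];
    # the density beyond `upper` is negligible for these d.f.
    h = (upper - f_stat) / n
    total = f_pdf(f_stat, d1, d2) + f_pdf(upper, d1, d2)
    for i in range(1, n):
        total += f_pdf(f_stat + i * h, d1, d2) * (4 if i % 2 else 2)
    return total * h / 3

p = f_tail(0.76, 5, 42)
print(round(p, 3))  # close to Stata's Prob > F = 0.5856
```

<p>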
This increases my confidence in parallel trends holding in 2013-2019, though of course it is not a direct test of this.<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Solution to (g)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<p>Homework3.do<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">* Part (a)\nuse ehec_data.dta, clear\nbrowse\ntab year\n* Part (b)\ntab yexp2 if year == 2009, m\n* Part (c)\ngen treatment = .\nreplace treatment = 1 if yexp2 == 2014\nreplace treatment = 0 if yexp2 >= 2016\ndrop if treatment == .\ntab treatment if year==2008, m\n* Part (d)\ngen y2014 = (year == 2014)\ngen t_y2014 = y2014 * treatment\nreg dins treatment y2014 t_y2014 if year == 2013 | year == 2014, cluster(stfips)\n* Part (e)\nforvalues yr = 2008\/2019{\n\tgen t`yr' = treatment * (year == `yr')\n}\ncap ssc install coefplot\nreplace t2013 = 0\nreg dins t2008-t2012 t2013 t2014-t2019 i.year i.stfips, cluster(stfips)\ncoefplot, omitted keep(t2*) vertical \ngraph export DD_1.png, replace\n* Part (f)\ntest t2008 t2009 t2010 t2011 t2012<\/code><\/pre>\n<\/div><\/details>\n<\/div>\n\n\n\n<h2 class=\"wp-block-heading\">Lab 1<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Basic STATA<\/h3>\n\n\n\n<p>Using the data in <strong>gdbcn.csv<\/strong> (GDP of China, 1992-2003), perform the following operations in Stata.<\/p>\n\n\n\n<p>Write the corresponding Stata commands for each of the following requirements.<\/p>\n\n\n\n<div class=\"swell-block-accordion\">\n<details class=\"swell-block-accordion__item\" 
data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">1. Import the data<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">cd Lab1\nimport delimited using gdbcn.csv, encoding(GB2312)<\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. cd Lab1\nLab1\n\n. import delimited using gdbcn.csv, encoding(GB2312)\n(3 vars, 380 obs)<\/code><\/pre>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">2. How many observations are there?<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">count<\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. count\n  380<\/code><\/pre>\n\n\n\n<p>There are 380 observations.<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">3. 
How many variables are there, and what are their names?<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">describe<\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. describe\n\nContains data\n  obs:           380                          \n vars:             3                          \n size:         5,700                          \n--------------------------------------------------------------------------------------------\n              storage   display    value\nvariable name   type    format     label      variable label\n--------------------------------------------------------------------------------------------\nThrhold_enddt  str10   %10s                  Thrhold_EndDt\nGDP_P_C_GDP~u  float   %9.0g                 GDP_P_C_GDP_Pric_Cumu\nv3             byte    %8.0g                 \n--------------------------------------------------------------------------------------------\nSorted by: \n     Note: Dataset has changed since last saved.<\/code><\/pre>\n\n\n\n<p>There are 2 substantive variables: Thrhold_enddt and GDP_P_C_GDP~u. The third variable, v3, is an empty artifact produced by the malformed csv file and carries no data.<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">4. What does the second variable mean? 
(Determine through its label).<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<p>GDP_Price_Cumulative<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">5. What is the mean of the Gross Domestic Product (GDP)?<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">summarize GDP_P_C<\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. summarize GDP_P_C\n\n    Variable |        Obs        Mean    Std. Dev.       Min        Max\n-------------+---------------------------------------------------------\nGDP_P_C_GD~u |        126    247994.2    271458.1     5262.8    1210207<\/code><\/pre>\n\n\n\n<p>The mean is 247994.2 (computed over the 126 non-missing observations).<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">6. 
Output the number of missing values for each variable.<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">misstable summarize<\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code\"><code>. misstable summarize\n                                                               Obs&lt;.\n                                                +------------------------------\n               |                                | Unique\n      Variable |     Obs=.     Obs&gt;.     Obs&lt;.  | values        Min         Max\n  -------------+--------------------------------+------------------------------\n  GDP_P_C_GD~u |       254                 126  |    126     5262.8     1210207\n            v3 |       380                   0  |      0          .           .\n  -----------------------------------------------------------------------------<\/code><\/pre>\n\n\n\n<p>GDP_P_C_GDP~u has 254 missing values, and v3 is entirely missing (all 380 observations); Thrhold_enddt does not appear in the table, so it has no missing values.<\/p>\n<\/div><\/details>\n<\/div>\n\n\n\n<h3 class=\"wp-block-heading\">Regression Analysis<\/h3>\n\n\n\n<p>Using the data from <strong>HPRICE1<\/strong>, estimate the following model:$$\\text{price} = \\beta_0 + \\beta_1 \\cdot \\text{sqrft} + \\beta_2 \\cdot \\text{bdrms} + \\mu$$<\/p>\n\n\n\n<p>where <strong>price<\/strong> represents the housing price in thousands of dollars.<\/p>\n\n\n\n<div class=\"swell-block-accordion\">\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">1. 
Write the result in equation form.<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">cd Lab1\nuse hprice1.dta, clear\ndescribe\nreg price sqrft bdrms<\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. cd Lab1\nLab1\n\n. use hprice1.dta, clear\n\n. describe\n\nContains data from Lab1\/hprice1.dta\n  obs:            88                          \n vars:            10                          17 Mar 2002 12:21\n size:         2,816                          \n-------------------------------------------------------------------------------\n              storage   display    value\nvariable name   type    format     label      variable label\n-------------------------------------------------------------------------------\nprice           float   %9.0g                 house price, $1000s\nassess          float   %9.0g                 assessed value, $1000s\nbdrms           byte    %9.0g                 number of bdrms\nlotsize         float   %9.0g                 size of lot in square feet\nsqrft           int     %9.0g                 size of house in square feet\ncolonial        byte    %9.0g                 =1 if home is colonial style\nlprice          float   %9.0g                 log(price)\nlassess         float   %9.0g                 log(assess\nllotsize        float   %9.0g                 log(lotsize)\nlsqrft          float   %9.0g                 log(sqrft)\n-------------------------------------------------------------------------------\nSorted by: \n\n. 
reg price sqrft bdrms\n\n      Source |       SS           df       MS      Number of obs   =        88\n-------------+----------------------------------   F(2, 85)        =     72.96\n       Model |  580009.152         2  290004.576   Prob > F        =    0.0000\n    Residual |  337845.354        85  3974.65122   R-squared       =    0.6319\n-------------+----------------------------------   Adj R-squared   =    0.6233\n       Total |  917854.506        87  10550.0518   Root MSE        =    63.045\n\n------------------------------------------------------------------------------\n       price |      Coef.   Std. Err.      t    P>|t|     &#91;95% Conf. Interval]\n-------------+----------------------------------------------------------------\n       sqrft |   .1284362   .0138245     9.29   0.000     .1009495    .1559229\n       bdrms |   15.19819   9.483517     1.60   0.113    -3.657582    34.05396\n       _cons |    -19.315   31.04662    -0.62   0.536    -81.04399      42.414\n------------------------------------------------------------------------------<\/code><\/pre>\n\n\n\n<p>The estimated equation is $$\\widehat{\\text{price}} = -19.315 + 0.12844 \\cdot \\text{sqrft} + 15.198 \\cdot \\text{bdrms}$$ (as a fitted equation, it carries no error term).<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">2. 
Estimate the increase in price when a bedroom is added without changing the area.<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<p>The coefficient for <code>bdrms<\/code> is $\\beta_2 = 15.198$.<br>This means that adding one bedroom, while keeping square footage constant, is estimated to increase the price by <span class=\"swl-marker mark_yellow\">$15,198<\/span>.<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">3. Estimate the effect of adding a bedroom that is 140 square feet in size. Compare this result with the one obtained in part (2).<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<p>The total impact of adding a bedroom with 140 square feet is the sum of the effects of the additional square footage and the additional bedroom:$\\Delta \\text{price} = \\beta_1 \\cdot 140 + \\beta_2$.<\/p>\n\n\n\n<p>Substituting the coefficients:$\\Delta \\text{price} = 0.12844 \\cdot 140 + 15.198 = 17.9816 + 15.198 = 33.1796$.<\/p>\n\n\n\n<p>Thus, the price is estimated to increase by <span class=\"swl-marker mark_yellow\">$33,180<\/span> when a bedroom with 140 square feet is added.<\/p>\n\n\n\n<p><strong>Comparison with Part 2:<\/strong><br>The price increase from adding a bedroom with 140 square feet is higher than adding a bedroom alone because the additional square footage also 
adds value.<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">4. Determine the proportion of price variation explained by square footage and the number of bedrooms.<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<p>The $R^2$ value from the regression output is 0.6319.<\/p>\n\n\n\n<p>This indicates that <span class=\"swl-marker mark_yellow\">63.19%<\/span> of the variation in housing prices can be explained by the square footage ($\\text{sqrft}$) and the number of bedrooms ($\\text{bdrms}$) in the model.<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">5. Predict the sales price of the first house in the sample.<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">gen predicted_price = _b&#91;_cons] + _b&#91;sqrft]*sqrft + _b&#91;bdrms]*bdrms\nlist predicted_price if _n==1<\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. gen predicted_price = _b&#91;_cons] + _b&#91;sqrft]*sqrft + _b&#91;bdrms]*bdrms\n\n. list predicted_price if _n==1\n\n     +-----------------+\n     | predicted_price |\n     |-----------------|\n  1. 
|     354.6053    |\n     +-----------------+<\/code><\/pre>\n\n\n\n<p>The predicted price is <span class=\"swl-marker mark_yellow\">$354,605<\/span>.<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">6. Given the actual price of $300,000 on the first house, compute the residual. Assess whether the buyer paid more or less based on the sign of the residual.<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">gen residual = price - predicted_price\nlist residual if _n == 1<\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. gen residual = price - predicted_price\n\n. list residual if _n == 1\n\n     +-----------+\n     |  residual |\n     |-----------|\n  1. 
| -54.60526 |\n     +-----------+<\/code><\/pre>\n\n\n\n<p>The residual is <span class=\"swl-marker mark_yellow\">-$54,605<\/span>, indicating the buyer paid less.<\/p>\n<\/div><\/details>\n<\/div>\n\n\n\n<h2 class=\"wp-block-heading\">Lab 2<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Data Visualization<\/h3>\n\n\n\n<p><strong>Experiment Requirements:<\/strong><\/p>\n\n\n\n<figure class=\"wp-block-image size-full is-style-photo_frame\"><img decoding=\"async\" width=\"2544\" height=\"910\" src=\"https:\/\/www.yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab2-1-0.png\" alt=\"\" class=\"wp-image-1923\" srcset=\"https:\/\/yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab2-1-0.png 2544w, https:\/\/yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab2-1-0-300x107.png 300w, https:\/\/yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab2-1-0-1024x366.png 1024w, https:\/\/yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab2-1-0-768x275.png 768w, https:\/\/yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab2-1-0-1536x549.png 1536w, https:\/\/yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab2-1-0-2048x733.png 2048w\" sizes=\"(max-width: 2544px) 100vw, 2544px\" \/><\/figure>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Complete the drawing of the two figures above (60%)<\/li>\n\n\n\n<li>Optimize the figures (e.g., titles, labels, coordinates, etc., you do not have to draw this exactly the same as the figures given) (30%)<\/li>\n\n\n\n<li>Analyze the visualization results (10%)<\/li>\n<\/ol>\n\n\n\n<h4 class=\"wp-block-heading\"><strong>The first figure<\/strong><\/h4>\n\n\n\n<div class=\"swell-block-accordion\">\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">The first figure<\/span><span 
class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">cd Lab2\nuse wdipol.dta, clear\ndescribe\nkeep if inlist(country, \"Ireland\",\"Kuwait\",\"Luxembourg\",\"Norway\",\"Qatar\",\"Singapore\",\"United States\")\negen max_gdppc = max(gdppc) if country==\"Ireland\"\ndrop if country==\"Ireland\" &amp; gdppc&lt;max_gdppc\ndrop if (country==\"Singapore\" | country==\"United States\") &amp; year&lt;2000\nsort country year\npreserve\nkeep if country==\"Kuwait\"\nsort year\nscalar kuwait_first = gdppc&#91;1]\nrestore\nsort country year\nreplace gdppc = . if country==\"Kuwait\" &amp; gdppc &lt; kuwait_first &amp; _n > 1\ngraph twoway (connected gdppc year, msymbol(diamond) mcolor(blue) lcolor(blue)), by(country, cols(3) compact note(\"Graphs by Country Name\")) ytitle(\"GDP per capita, PPP (constant 2005 international $)\") xtitle(\"Year\") legend(off) yscale(range(40000 .))<\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. cd Lab2\nLab2\n\n. use wdipol.dta, clear\n\n. 
describe\n\nContains data from wdipol.dta\n  obs:         4,542                          \n vars:            12                          25 Feb 2015 17:31\n size:       381,528                          \n--------------------------------------------------------------------------------------------------\n              storage   display    value\nvariable name   type    format     label      variable label\n--------------------------------------------------------------------------------------------------\nyear            int     %10.0g                Year\ncountry         str24   %24s                  Country Name\ngdppc           double  %10.0g                 GDP per capita, PPP (constant 2005 international $)\nunempf          double  %10.0g                Unemployment, female (% of female labor force)\nunempm          double  %10.0g                Unemployment, male (% of male labor force)\nunemp           double  %10.0g                Unemployment, total (% of total labor force)\nexport          double  %10.0g                Exports of goods and services (constant 2005 US$)\nimport          double  %10.0g                Imports of goods and services (constant 2005 US$)\npolity          byte    %8.0g                 polity (original)\npolity2         byte    %8.0g                 polity2 (adjusted)\ntrade           float   %9.0g                 Imports + Exports\nid              float   %9.0g                 group(country)\n-------------------------------------------------------------------------------------------------\nSorted by: \n\n. keep if inlist(country, \"Ireland\",\"Kuwait\",\"Luxembourg\",\"Norway\",\"Qatar\",\"Singapore\",\"United States\")\n(4,358 observations deleted)\n\n. egen max_gdppc = max(gdppc) if country==\"Ireland\"\n(171 missing values generated)\n\n. drop if country==\"Ireland\" &amp; gdppc&lt;max_gdppc\n(12 observations deleted)\n\n. 
drop if (country==\"Singapore\" | country==\"United States\") &amp; year&lt;2000\n(40 observations deleted)\n\n. sort country year\n\n. preserve\n\n. keep if country==\"Kuwait\"\n(105 observations deleted)\n\n. sort year\n\n. scalar kuwait_first = gdppc&#91;1]\n\n. restore\n\n. sort country year\n\n. replace gdppc = . if country==\"Kuwait\" &amp; gdppc &lt; kuwait_first &amp; _n &gt; 1\n(13 real changes made, 13 to missing)\n\n. graph twoway (connected gdppc year, msymbol(diamond) mcolor(blue) lcolor(blue)), by(country, cols(3) compact note(\"Graphs by\n&gt;  Country Name\")) ytitle(\"GDPper capital PPP (constant 2005 international $\") xtitle(\"Year\") legend(off) yscale(range(40000 .\n&gt; ))<\/code><\/pre>\n\n\n\n<p>Then you can use the graph editor to modify the layout.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full is-style-browser_mac\"><img decoding=\"async\" width=\"1127\" height=\"819\" src=\"https:\/\/www.yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab2-1-1.png\" alt=\"\" class=\"wp-image-1924\" srcset=\"https:\/\/yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab2-1-1.png 1127w, https:\/\/yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab2-1-1-300x218.png 300w, https:\/\/yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab2-1-1-1024x744.png 1024w, https:\/\/yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab2-1-1-768x558.png 768w\" sizes=\"(max-width: 1127px) 100vw, 1127px\" \/><\/figure>\n<\/div><\/details>\n<\/div>\n\n\n\n<h4 class=\"wp-block-heading\"><strong>The second figure<\/strong><\/h4>\n\n\n\n<div class=\"swell-block-accordion\">\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">The second figure<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" 
data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">cd Lab2\nuse wdipol.dta, clear\nkeep if inlist(country, \"Australia\", \"Qatar\", \"United Kingdom\", \"United States\")\nsort country year\ngraph twoway (connected gdppc year, msymbol(o) mcolor(blue) lcolor(blue)), by(country, rows(2) compact note(\"Graphs by Country Name\")) title(\"GDP pc (PPP, 2005=100)\") ytitle(\"GDP per capita, PPP (Constant 2005 international $)\") xtitle(\"Year\") legend(off)<\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. cd Lab2\nLab2\n\n. use wdipol.dta, clear\n\n. keep if inlist(country, \"Australia\", \"Qatar\", \"United Kingdom\", \"United States\")\n(4,431 observations deleted)\n\n. sort country year\n\n. graph twoway (connected gdppc year, msymbol(o) mcolor(blue) lcolor(blue)), by(country, rows(2) compact note(\"Graphs by Count\n> ry Name\")) title(\"GDP pc (PPP, 2005=100)\") ytitle(\"GDP per capita, PPP (Constant 2005 international $)\") xtitle(\"Year\") lege\n> nd(off)<\/code><\/pre>\n\n\n\n<p>Then you can use the graph editor to modify the layout.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full is-style-browser_mac\"><img decoding=\"async\" width=\"1127\" height=\"819\" src=\"https:\/\/www.yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab2-1-2.png\" alt=\"\" class=\"wp-image-1925\" srcset=\"https:\/\/yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab2-1-2.png 1127w, https:\/\/yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab2-1-2-300x218.png 300w, https:\/\/yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab2-1-2-1024x744.png 1024w, https:\/\/yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab2-1-2-768x558.png 768w\" sizes=\"(max-width: 1127px) 100vw, 1127px\" 
\/><\/figure>\n<\/div><\/details>\n<\/div>\n\n\n\n<h3 class=\"wp-block-heading\">Data Visualization in Econometrics<\/h3>\n\n\n\n<h4 class=\"wp-block-heading\">SLEEP75<\/h4>\n\n\n\n<p>Using the SLEEP75 data from Biddle and Hamermesh (1990), examine whether there is a trade-off between the time spent sleeping each week and the time spent on paid work. We can use either of these variables as the dependent variable.<\/p>\n\n\n\n<div class=\"swell-block-accordion\">\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">1. Estimate the model: $$\\text{sleep} = \\beta_0 + \\beta_1 \\text{totwrk} + \\mu$$<br>Where $\\text{sleep}$ represents the number of minutes spent sleeping at night each week, and $\\text{totwrk}$ represents the number of minutes spent on paid work during the same week. Report your results in equation form, along with the number of observations and $R^2$. What does the intercept in this equation represent?<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">cd Lab2\nuse SLEEP75.dta\ndescribe sleep totwrk\nreg sleep totwrk<\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. cd Lab2\nLab2\n\n. use SLEEP75.dta\n\n. 
describe sleep totwrk\n\n              storage   display    value\nvariable name   type    format     label      variable label\n------------------------------------------------------------------------------------------------------------------------------\nsleep           int     %9.0g                 mins sleep at night, per wk\ntotwrk          int     %9.0g                 mins worked per week\n\n. reg sleep totwrk\n\n      Source |       SS           df       MS      Number of obs   =       706\n-------------+----------------------------------   F(1, 704)       =     81.09\n       Model |  14381717.2         1  14381717.2   Prob > F        =    0.0000\n    Residual |   124858119       704  177355.282   R-squared       =    0.1033\n-------------+----------------------------------   Adj R-squared   =    0.1020\n       Total |   139239836       705  197503.313   Root MSE        =    421.14\n\n------------------------------------------------------------------------------\n       sleep |      Coef.   Std. Err.      t    P>|t|     &#91;95% Conf. Interval]\n-------------+----------------------------------------------------------------\n      totwrk |  -.1507458   .0167403    -9.00   0.000    -.1836126    -.117879\n       _cons |   3586.377   38.91243    92.17   0.000     3509.979    3662.775\n------------------------------------------------------------------------------<\/code><\/pre>\n\n\n\n<p>From the results above, the estimated equation is: $$\\widehat{\\text{sleep}} = 3586.377 - 0.1507458 \\cdot \\text{totwrk}$$ with $n = 706$ and $R^2 = 0.1033$. (The error term $\\mu$ belongs to the population model, so it does not appear in the fitted equation.)<\/p>\n\n\n\n<p>The intercept $\\hat{\\beta}_0$ represents the predicted number of minutes of nightly sleep per week when $\\text{totwrk}= 0$. In other words, it reflects the predicted total weekly nighttime sleep in the absence of paid work.<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">2. 
If $\\text{totwrk}$ increases by 2 hours, by how much is $\\text{sleep}$ estimated to decrease? Do you think this is a significant effect?<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<p>If $\\text{totwrk}$ increases by 2 hours, or 120 minutes, the estimated decrease in $\\text{sleep}$ is calculated as: $\\Delta \\text{sleep} = \\hat{\\beta}_1 \\times 120 = -0.1507458 \\times 120 \\approx -18 \\text{ minutes}$.<\/p>\n\n\n\n<p>Although the coefficient is statistically significant ($t = -9.00$), an additional 2 hours of work per week is associated with only an 18-minute reduction in weekly sleep, so the effect is economically small.<\/p>\n<\/div><\/details>\n<\/div>\n\n\n\n<h4 class=\"wp-block-heading\">WAGE2<\/h4>\n\n\n\n<p>Using data from WAGE2, estimate a simple regression to explain monthly wages using intelligence quotient.<\/p>\n\n\n\n<div class=\"swell-block-accordion\">\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">1. Calculate the average (here, you can use mean value to represent the average value) wage and the average IQ in the sample. What is the sample standard deviation of IQ? 
(In the population, IQ is standardized with a mean of 100 and a standard deviation of 15.)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">use WAGE2.dta, clear\nsummarize wage IQ<\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. use WAGE2.dta, clear\n\n. summarize wage IQ\n\n    Variable |        Obs        Mean    Std. Dev.       Min        Max\n-------------+---------------------------------------------------------\n        wage |        935    957.9455    404.3608        115       3078\n          IQ |        935    101.2824    15.05264         50        145<\/code><\/pre>\n\n\n\n<p>Mean wage : 957.9455<br>Mean IQ : 101.2824<br>Standard deviation of IQ : 15.05264<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">2. Estimate a simple regression model where an increase of one unit in IQ results in a specific change in <em>wage<\/em>. Using this model, calculate the expected change in wages when IQ increases by 15 units. 
Does IQ explain most of the variation in wages?<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<p>Here, we use a <strong>Linear Model<\/strong> (wage in levels).<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">reg wage IQ<\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. reg wage IQ\n\n      Source |       SS           df       MS      Number of obs   =       935\n-------------+----------------------------------   F(1, 933)       =     98.55\n       Model |  14589782.6         1  14589782.6   Prob > F        =    0.0000\n    Residual |   138126386       933  148045.429   R-squared       =    0.0955\n-------------+----------------------------------   Adj R-squared   =    0.0946\n       Total |   152716168       934  163507.675   Root MSE        =    384.77\n\n------------------------------------------------------------------------------\n        wage |      Coef.   Std. Err.      t    P>|t|     &#91;95% Conf. Interval]\n-------------+----------------------------------------------------------------\n          IQ |   8.303064   .8363951     9.93   0.000     6.661631    9.944498\n       _cons |   116.9916   85.64153     1.37   0.172    -51.08078    285.0639\n------------------------------------------------------------------------------<\/code><\/pre>\n\n\n\n<p>$$\\widehat{\\text{wage}} = 116.9916 + 8.303064 \\cdot \\text{IQ}$$<\/p>\n\n\n\n<p>An increase of 15 IQ points (approximately one standard deviation) would result in an estimated wage increase of $15 \\times 8.303064 \\approx 124.55$ USD per month.<\/p>\n\n\n\n<p>$R^2 \\approx 0.0955$ indicates that IQ explains less than 10% of the variation in wages. 
Most of the wage variation is determined by factors other than IQ.<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">3. Now estimate a model where an increase of one unit in IQ has the same percentage impact on wages. If IQ increases by 15 units, what is the approximate expected percentage increase in wages?<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<p>Here, we use a <strong>Log-Linear Model<\/strong>, so the coefficient on IQ measures the approximate proportional change in wage per IQ point.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">reg lwage IQ<\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. reg lwage IQ\n\n      Source |       SS           df       MS      Number of obs   =       935\n-------------+----------------------------------   F(1, 933)       =    102.62\n       Model |  16.4150939         1  16.4150939   Prob > F        =    0.0000\n    Residual |  149.241189       933  .159958402   R-squared       =    0.0991\n-------------+----------------------------------   Adj R-squared   =    0.0981\n       Total |  165.656283       934  .177362188   Root MSE        =    .39995\n\n------------------------------------------------------------------------------\n       lwage |      Coef.   Std. Err.      t    P>|t|     &#91;95% Conf. 
Interval]\n-------------+----------------------------------------------------------------\n          IQ |   .0088072   .0008694    10.13   0.000      .007101    .0105134\n       _cons |   5.886994   .0890206    66.13   0.000     5.712291    6.061698\n------------------------------------------------------------------------------<\/code><\/pre>\n\n\n\n<p>$$\\widehat{\\ln(\\text{wage})} = 5.887 + 0.0088072 \\cdot \\text{IQ}$$<\/p>\n\n\n\n<p>An increase of 15 IQ points would lead to an estimated wage increase of approximately $15 \\times 0.88\\% \\approx 13.2\\%$; the exact implied increase is $e^{0.1321} - 1 \\approx 14.1\\%$.<\/p>\n<\/div><\/details>\n<\/div>\n\n\n\n<h2 class=\"wp-block-heading\">Lab 3<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Macro<\/h3>\n\n\n\n<div class=\"swell-block-accordion\">\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">Use macro to draw a heart curve. We suggest using the following curve:$$\\begin{cases} x = \\sin(t) \\cos(t) \\ln(|t|), \\\\\\\\ y = |t|^{0.3} \\sqrt{\\cos(t)}, \\end{cases} \\quad t \\in \\left[-\\frac{\\pi}{2}, \\frac{\\pi}{2}\\right]$$<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">clear\nset obs 50000\ntempvar t\ngen `t' = runiform(-0.5 * _pi, 0.5 * _pi)\nsort `t'\nlocal heart\nlocal points = 50\nlocal runs = 200\nlocal i = 1\nwhile `i' &lt;= `runs' {\n    display \"`i'\"\n    tempvar control`i' x`i' y`i'\n    gen `control`i'' = int(runiform(1,_N))\n    gen `x`i'' = sin(`t')*cos(`t')*ln(abs(`t')) if `control`i'' &lt;= `points'\n    gen `y`i'' = (abs(`t'))^(0.3)*(cos(`t'))^(0.5) if `control`i'' &lt;= `points'\n    local heart `heart' (area `y`i'' `x`i'', 
nodropbase lc(black) lw(vthin) fc(red%5))\n    local i = `i' + 1\n}\ntwoway `heart', aspect(0.8) xscale(off) yscale(off) xlabel(, nogrid) ylabel(, nogrid) legend(off) xsize(1) ysize(1)<\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. clear\n\n. set obs 50000\nnumber of observations (_N) was 0, now 50,000\n\n. tempvar t\n\n. gen `t' = runiform(-0.5 * _pi, 0.5 * _pi)\n\n. sort `t'\n\n. local heart\n\n. local points = 50\n\n. local runs = 200\n\n. local i = 1\n\n. while `i' &lt;= `runs' {\n  2. display \"`i'\"\n  3. tempvar control`i' x`i' y`i'\n  4. gen `control`i'' = int(runiform(1,_N))\n  5. gen `x`i'' = sin(`t')*cos(`t')*ln(abs(`t')) if `control`i'' &lt;= `points'\n  6. gen `y`i'' = (abs(`t'))^(0.3)*(cos(`t'))^(0.5) if `control`i'' &lt;= `points'\n  7. local heart `heart' (area `y`i'' `x`i'', nodropbase lc(black) lw(vthin) fc(red%5))\n  8. local i = `i' + 1\n  9. }\n1\n(49,943 missing values generated)\n(49,943 missing values generated)\n2\n(49,949 missing values generated)\n(49,949 missing values generated)\n3\n(49,945 missing values generated)\n(49,945 missing values generated)\n........................................................\n200\n(49,950 missing values generated)\n(49,950 missing values generated)\n\n. 
twoway `heart', aspect(0.8) xscale(off) yscale(off) xlabel(, nogrid) ylabel(, nogrid) legend(off) xsize(1) ysize(1)<\/code><\/pre>\n\n\n\n<figure class=\"wp-block-image size-full is-style-default\"><img decoding=\"async\" width=\"821\" height=\"821\" src=\"https:\/\/www.yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab3-1.png\" alt=\"\" class=\"wp-image-1926\" srcset=\"https:\/\/yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab3-1.png 821w, https:\/\/yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab3-1-300x300.png 300w, https:\/\/yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab3-1-150x150.png 150w, https:\/\/yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab3-1-768x768.png 768w\" sizes=\"(max-width: 821px) 100vw, 821px\" \/><\/figure>\n<\/div><\/details>\n<\/div>\n\n\n\n<h3 class=\"wp-block-heading\">Group Assignment<\/h3>\n\n\n\n<p><strong>Requirements<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Use a <code>.do<\/code> file to collect all commands.<\/li>\n\n\n\n<li>Data is from the paper <a href=\"https:\/\/www.hbs.edu\/ris\/Publication%20Files\/Americans%20Do%20IT%20Better_7cd547bc-c5f3-49b0-acf6-d0d4bc038a38.pdf\">Americans Do IT Better: US Multinationals and the Productivity Miracle<\/a> by Nick Bloom, Rafaella Sadun, and John van Reenen, forthcoming in the <em>American Economic Review.<\/em><\/li>\n\n\n\n<li>Submit as a group; only one submission per group is required.<\/li>\n\n\n\n<li>The submission format should be: <code>StudentID1+Name1+StudentID2+Name2.do<\/code>, for example: <code>202422+Amamitsu+202423+Yanagi.do<\/code>.<\/li>\n\n\n\n<li>Data and the paper are attached.<\/li>\n\n\n\n<li>At the beginning of the <code>.do<\/code> file, include comments listing the student ID and name of every group member.<\/li>\n\n\n\n<li>Use comments to label each question with its corresponding number. 
(If a question number is missing, it will be treated as incomplete.)<\/li>\n\n\n\n<li>For questions requiring explanations, answer using comments.<\/li>\n\n\n\n<li>Ensure your <code>.do<\/code> file can execute correctly without errors.<\/li>\n<\/ul>\n\n\n\n<p><strong>Questions<\/strong><\/p>\n\n\n\n<div class=\"swell-block-accordion\">\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">1. Open the dataset <code>replicate.dta<\/code>.<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">cd Lab3\nuse replicate.dta, clear<\/code><\/pre>\n\n\n\n<p>The output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. cd Lab3\nLab3\n\n. use replicate.dta, clear\n<\/code><\/pre>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">2. 
Use the <code>describe<\/code> command to determine the number of observations and identify the variable containing &#8220;people management&#8221; score information.<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">describe<\/code><\/pre>\n\n\n\n<p>The output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. describe\n\nContains data from replicate.dta\n  obs:         8,417                          \n vars:            33                          17 Oct 2011 19:33\n size:       942,704                          (_dta has notes)\n--------------------------------------------------------------------------------------------------------------------------\n              storage   display    value\nvariable name   type    format     label      variable label\n--------------------------------------------------------------------------------------------------------------------------\nanalyst         str10   %10s                  Person that ran the interview\ncompany_code    int     %9.0g                 Individual company level code - not the actual BVD number for anonymity\ncover           float   %9.0g                 Share of employees in the firm surveyed by Harte-Hanks\ncty             str2    %9s                   Country\ndu_oth_mu       byte    %9.0g                 Non-US multinational\ndu_usa_mu       byte    %9.0g                 US multinational\nemployees_a     int     %8.0g                 Firm level employees\nhours_t         float   %9.0g                 Average works worked by employees\ninterview       int     %9.0g                 Code for each interview - ordered by company and interviewer\nlcap            float   %9.0g           
      Log(net tangible fixed assets) in current dollars per employee\nldegree_t       float   %9.0g                 Log(employees with a degree), with missing set to -99\nldegree_t_miss  byte    %9.0g                 Missing dummy for Log(employees with a degree)\nlemp            float   %9.0g                 Log(employees in the firm)\nlmat            float   %9.0g                 Log(materials) in current dollars\nlpcemp          float   %9.0g                 Log of computers per employee, set to zero for missing values\nlpcemp_du_oth~u float   %9.0g                 Interaction of log(pcemp) with non-US multinational ownership\nlpcemp_du_usa~u float   %9.0g                 Interaction of log(pcemp) with US multinational ownership\nlpcemp_ldegre~t float   %9.0g                 log(pcemp) interacted with log(degree)\nlpcemp_ldegre~s float   %9.0g                 log(pcemp) interacted with log(degree)_miss\nlpcemp_peeps    float   %9.0g                 Interaction of log(pcemp) with people management\nly              float   %9.0g                 Log(sales) in current dollars\nmanagement      float   %9.0g                 Average of all management practices z-scores, normalized to SD of 1\nmonitoring      float   %9.0g                 Average of monitoring management practices z-scores, normalized to SD of 1\noperations      float   %9.0g                 Average of operations management practices z-scores, normalized to SD of 1\npeeps           float   %9.0g                 Average of people management, normalized to SD of 1\npublic          byte    %8.0g      public     Publicly listed company, -99 for missing\npublicmiss      byte    %9.0g                 Publicly listed company missing dummy\ns_count         byte    %9.0g                 1=Unique match of HH site to BVD code. 
0=Multiple matches or jumps, .=no match\nsic             int     %8.0g                 US Sic code\ntargets         float   %9.0g                 Average of targets management practices z-scores, normalized to SD of 1\nunion           float   %8.0g                 Pct of union members\nwages_a         double  %8.0g                 Cost of employees, 000$\nyear            int     %9.0g                 year of the accounts and IT data (all management data collected in 2006)\n--------------------------------------------------------------------------------------------------------------------------\nSorted by: interview  year<\/code><\/pre>\n\n\n\n<p>From the <code>describe<\/code> output, the dataset has 8,417 observations, and <code>peeps<\/code> holds the &#8220;people management&#8221; score. (Note that <code>lpcemp_peeps<\/code> is not a log of <code>peeps<\/code>; it is the interaction of log computers per employee with the people management score.)<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">3. Find the mean of the &#8220;people management&#8221; score.<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">summarize peeps, detail<\/code><\/pre>\n\n\n\n<p>The output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. summarize peeps, detail\n\n      Average of people management, normalized to SD of\n                              1\n-------------------------------------------------------------\n      Percentiles      Smallest\n 1%    -1.464864      -1.693648\n 5%    -1.214176      -1.693648\n10%    -.9772028      -1.693648       Obs               8,417\n25%    -.5124432      -1.693648       Sum of Wgt.       
8,417\n\n50%    -.0391659                      Mean          -.0192126\n                        Largest       Std. Dev.      .7060643\n75%      .433783       2.087268\n90%      .906063       2.087268       Variance       .4985268\n95%     1.148562       2.087268       Skewness       .1634675\n99%     1.621511       2.087268       Kurtosis       2.699202<\/code><\/pre>\n\n\n\n<p>The mean value is -0.0192.<\/p>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">4. Use the <code>tabulate<\/code> command to identify the countries and years in the sample, and the number of observations for each year and country.<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">tabulate cty year, missing<\/code><\/pre>\n\n\n\n<p>The output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. 
tabulate cty year, missing\n\n           |        year of the accounts and IT data (all management data collected in 2006)\n   Country |      1999       2000       2001       2002       2003       2004       2005       2006 |     Total\n-----------+----------------------------------------------------------------------------------------+----------\n        fr |       189        166        191        216        218        232        232          8 |     1,452 \n        ge |        59         57         61         72         82         83         59          1 |       474 \n        it |        97        130        141        149        137        155        106          3 |       918 \n        po |        70         87        102        172        167        166         86          0 |       850 \n        pt |        48         46         52         79        120        101         57          0 |       503 \n        sw |       167        179        125        175        179        199        183          6 |     1,213 \n        uk |       327        422        413        454        457        479        425         30 |     3,007 \n-----------+----------------------------------------------------------------------------------------+----------\n     Total |       957      1,087      1,085      1,317      1,360      1,415      1,148         48 |     8,417<\/code><\/pre>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">5. What are the mean, standard deviation, and number of observations for employment levels in UK companies? 
Calculate these statistics separately for US multinationals, other multinationals, and UK domestic firms to replicate column 1 of Table 1.<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">gen byte company_type = .\nreplace company_type = 1 if du_usa_mu == 1\nreplace company_type = 2 if du_oth_mu == 1\nreplace company_type = 3 if du_usa_mu == 0 & du_oth_mu == 0\nlabel define company_type_lbl 1 \"US Multinational\" 2 \"Non-US Multinational\" 3 \"UK Domestic\"\nlabel values company_type company_type_lbl\ntabulate company_type\ntabstat employees_a, by(company_type) statistics(mean sd count) columns(statistics)<\/code><\/pre>\n\n\n\n<p>The output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. gen byte company_type = .\n(8,417 missing values generated)\n\n. replace company_type = 1 if du_usa_mu == 1\n(919 real changes made)\n\n. replace company_type = 2 if du_oth_mu == 1\n(2,172 real changes made)\n\n. replace company_type = 3 if du_usa_mu == 0 &amp; du_oth_mu == 0\n(5,326 real changes made)\n\n. label define company_type_lbl 1 \"US Multinational\" 2 \"Non-US Multinational\" 3 \"UK Domestic\"\n\n. label values company_type company_type_lbl\n\n. tabulate company_type\n\n        company_type |      Freq.     Percent        Cum.\n---------------------+-----------------------------------\n    US Multinational |        919       10.92       10.92\nNon-US Multinational |      2,172       25.80       36.72\n         UK Domestic |      5,326       63.28      100.00\n---------------------+-----------------------------------\n               Total |      8,417      100.00\n\n. 
tabstat employees_a, by(company_type) statistics(mean sd count) columns(statistics)\n\nSummary for variables: employees_a\n     by categories of: company_type \n\n    company_type |      mean        sd         N\n-----------------+------------------------------\nUS Multinational |  495.2688  645.1402       919\nNon-US Multinati |   428.785  509.3583      2172\n     UK Domestic |  417.6536   650.362      5326\n-----------------+------------------------------\n           Total |  429.0004  616.8552      8417\n------------------------------------------------\n<\/code><\/pre>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">6. Find the average management score for each country and year.<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">preserve\ncollapse (mean) avg_management=management, by(cty year)\nlist\nrestore<\/code><\/pre>\n\n\n\n<p>The output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. preserve\n\n. collapse (mean) avg_management=management, by(cty year)\n\n. list\n\n     +------------------------+\n     | cty   year   avg_man~t |\n     |------------------------|\n  1. |  fr   1999    .0334847 |\n  2. |  fr   2000    .0048502 |\n  3. |  fr   2001    .0730181 |\n  4. |  fr   2002    .0875922 |\n  5. |  fr   2003    .0530531 |\n     |------------------------|\n  6. |  fr   2004    .0823055 |\n  7. |  fr   2005    .1030681 |\n  8. |  fr   2006   -.0640106 |\n  9. |  ge   1999    .4130789 |\n 10. |  ge   2000    .3726625 |\n     |------------------------|\n 11. 
|  ge   2001    .4496971 |\n 12. |  ge   2002    .3818938 |\n 13. |  ge   2003    .4633776 |\n 14. |  ge   2004    .4582495 |\n 15. |  ge   2005    .4307467 |\n     |------------------------|\n 16. |  ge   2006    .6042405 |\n 17. |  it   1999    .0388724 |\n 18. |  it   2000    .0418419 |\n 19. |  it   2001    .0179532 |\n 20. |  it   2002    .0626969 |\n     |------------------------|\n 21. |  it   2003    .0075816 |\n 22. |  it   2004    .0139907 |\n 23. |  it   2005    .0230397 |\n 24. |  it   2006     .661514 |\n 25. |  po   1999   -.0150875 |\n     |------------------------|\n 26. |  po   2000    .1379714 |\n 27. |  po   2001      .08615 |\n 28. |  po   2002    .0266212 |\n 29. |  po   2003   -.1027957 |\n 30. |  po   2004   -.0598225 |\n     |------------------------|\n 31. |  po   2005   -.0460614 |\n 32. |  pt   1999   -.1548034 |\n 33. |  pt   2000   -.1954222 |\n 34. |  pt   2001   -.3220826 |\n 35. |  pt   2002   -.3969625 |\n     |------------------------|\n 36. |  pt   2003   -.3443317 |\n 37. |  pt   2004   -.3615427 |\n 38. |  pt   2005   -.5732161 |\n 39. |  sw   1999    .3488918 |\n 40. |  sw   2000    .3164622 |\n     |------------------------|\n 41. |  sw   2001    .3725764 |\n 42. |  sw   2002     .336567 |\n 43. |  sw   2003    .2955157 |\n 44. |  sw   2004    .3070801 |\n 45. |  sw   2005    .2999685 |\n     |------------------------|\n 46. |  sw   2006   -.4343244 |\n 47. |  uk   1999    .0758213 |\n 48. |  uk   2000    .0747772 |\n 49. |  uk   2001    .1105612 |\n 50. |  uk   2002    .0898244 |\n     |------------------------|\n 51. |  uk   2003    .0839255 |\n 52. |  uk   2004     .069181 |\n 53. |  uk   2005    .0531464 |\n 54. |  uk   2006   -.0244585 |\n     +------------------------+\n\n. restore\n<\/code><\/pre>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">7. 
Create a horizontal bar chart showing the average &#8220;people management&#8221; score for each country, replicating Figure 3a from the paper.<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">preserve\ncollapse (mean) avg_peeps=peeps, by(cty)\ngen sort_order = -avg_peeps\ngraph hbar avg_peeps, over(cty, sort(sort_order)) title(\"Average People Management Scores by Country\") ylabel(, angle(0)) scheme(s1color)\ngraph export \"Average_People_Management_by_Country.png\", width(800) replace\nrestore\n<\/code><\/pre>\n\n\n\n<p>The output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. preserve\n\n. collapse (mean) avg_peeps=peeps, by(cty)\n\n. gen sort_order = -avg_peeps\n\n. graph hbar avg_peeps, over(cty, sort(sort_order)) title(\"Average People Manag\n> ement Scores by Country\") ylabel(, angle(0)) scheme(s1color)\n \n. graph export \"Average_People_Management_by_Country.png\", width(800) replace\n(file Average_People_Management_by_Country.png written in PNG format)\n\n. 
restore\n<\/code><\/pre>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"2376\" height=\"1728\" src=\"https:\/\/www.yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab3-2-1.png\" alt=\"\" class=\"wp-image-1939\" srcset=\"https:\/\/yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab3-2-1.png 2376w, https:\/\/yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab3-2-1-300x218.png 300w, https:\/\/yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab3-2-1-1024x745.png 1024w, https:\/\/yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab3-2-1-768x559.png 768w, https:\/\/yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab3-2-1-1536x1117.png 1536w, https:\/\/yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab3-2-1-2048x1489.png 2048w\" sizes=\"(max-width: 2376px) 100vw, 2376px\" \/><\/figure>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">8. 
Repeat the same chart but include only US multinational subsidiaries.<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">preserve\nkeep if company_type == 1\ncollapse (mean) avg_peeps=peeps, by(cty)\ngen sort_order = -avg_peeps\ngraph hbar avg_peeps, over(cty, sort(sort_order)) title(\"Average People Management Scores by Country (US Multinationals)\") ylabel(, angle(0)) scheme(s1color)\ngraph export \"Average_People_Management_US_Multinationals.png\", width(800) replace\nrestore<\/code><\/pre>\n\n\n\n<p>The output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. preserve\n\n. keep if company_type == 1\n(7,498 observations deleted)\n\n. collapse (mean) avg_peeps=peeps, by(cty)\n\n. gen sort_order = -avg_peeps\n\n. graph hbar avg_peeps, over(cty, sort(sort_order)) title(\"Average People Manag\n> ement Scores by Country (US Multinationals)\") ylabel(, angle(0)) scheme(s1col\n> or)\n\n. \n. graph export \"Average_People_Management_US_Multinationals.png\", width(800) re\n> place\n(file Average_People_Management_US_Multinationals.png written in PNG format)\n\n. \n. 
restore\n<\/code><\/pre>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"2343\" height=\"1704\" src=\"https:\/\/www.yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab3-2-2.png\" alt=\"\" class=\"wp-image-1940\" srcset=\"https:\/\/yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab3-2-2.png 2343w, https:\/\/yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab3-2-2-300x218.png 300w, https:\/\/yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab3-2-2-1024x745.png 1024w, https:\/\/yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab3-2-2-768x559.png 768w, https:\/\/yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab3-2-2-1536x1117.png 1536w, https:\/\/yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab3-2-2-2048x1489.png 2048w\" sizes=\"(max-width: 2343px) 100vw, 2343px\" \/><\/figure>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">9. Generate a variable equal to the total working hours of the company.<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">gen total_hours = employees_a * hours_t<\/code><\/pre>\n\n\n\n<p>The output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. 
gen total_hours = employees_a * hours_t\n(725 missing values generated)\n<\/code><\/pre>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">10. List the top 10 observations to verify whether your new variable is correctly defined.<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">list company_code cty year employees_a hours_t total_hours in 1\/10<\/code><\/pre>\n\n\n\n<p>The output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. list company_code cty year employees_a hours_t total_hours in 1\/10\n\n     +-------------------------------------------------------+\n     | compa~de   cty   year   employ~a   hours_t   total_~s |\n     |-------------------------------------------------------|\n  1. |        3    ge   2001        465      4176    1941840 |\n  2. |        3    ge   2002        526      4176    2196576 |\n  3. |        4    ge   2001       2113      3920    8282960 |\n  4. |        4    ge   2002       1996      3920    7824320 |\n  5. |        4    ge   2003       1853      3920    7263760 |\n     |-------------------------------------------------------|\n  6. |        4    ge   2004       1888      3920    7400960 |\n  7. |        5    ge   2001       2261         .          . |\n  8. |        5    ge   2002       2273         .          . |\n  9. |        5    ge   2003       2336         .          . |\n 10. |        5    ge   2004       2518         .          . 
|\n     +-------------------------------------------------------+<\/code><\/pre>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">10+. Drop the variable you just defined.<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">drop total_hours<\/code><\/pre>\n\n\n\n<p>The output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. drop total_hours\n<\/code><\/pre>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">11. Create a dummy variable (0\/1) where the value is 1 if the company has at least one union member and 0 otherwise. (Hint: Use <code>generate<\/code>, <code>replace<\/code>, and <code>if<\/code> together.)<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">generate union_dummy = 0\nreplace union_dummy = 1 if union > 0<\/code><\/pre>\n\n\n\n<p>The output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. generate union_dummy = 0\n\n. 
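* caution: Stata treats missing values as larger than any number, so \"union > 0\"\n. * also flags firms with a missing union value; a safer sketch (not run here) is:\n. * replace union_dummy = 1 if union > 0 & !missing(union)\n\n. 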
replace union_dummy = 1 if union > 0\n(6,294 real changes made)<\/code><\/pre>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">12. Rename the management score variable to start with a common prefix, such as <code>m_peeps<\/code>.<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">rename peeps m_peeps<\/code><\/pre>\n\n\n\n<p>The output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. rename peeps m_peeps\n<\/code><\/pre>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">13. Create a variable representing the total sum of all individual management scores. Compare this to the existing variable <code>management<\/code>. Why are they different? 
Explain the discrepancy and adjust the formula until the two variables match.<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<p class=\"is-style-big_icon_caution\"><span class=\"swl-marker mark_orange\">This result is still not correct.<\/span> But the teacher said it is okay.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">drop if missing(m_peeps, monitoring, operations, targets)\nforeach var of varlist m_peeps monitoring operations targets {\n    summarize `var'\n    scalar mean_`var' = r(mean)\n    scalar sd_`var'   = r(sd)\n    gen double `var'_z2 = (`var' - mean_`var') \/ sd_`var'\n}\negen management_sum_avg2 = rowmean(m_peeps_z2 monitoring_z2 operations_z2 targets_z2)\nsummarize management_sum_avg2\nscalar mean_m2 = r(mean)\nscalar sd_m2   = r(sd)\ngen management_sum_z2 = (management_sum_avg2 - mean_m2) \/ sd_m2\nsummarize management_sum_z2 management\ncorrelate management_sum_z2 management<\/code><\/pre>\n\n\n\n<p>The output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">.  drop if missing(m_peeps, monitoring, operations, targets)\n(7 observations deleted)\n\n. foreach var of varlist m_peeps monitoring operations targets {\n  2. \n.     summarize `var'\n  3. \n.     scalar mean_`var' = r(mean)\n  4. \n.     scalar sd_`var'   = r(sd)\n  5. \n.     gen double `var'_z2 = (`var' - mean_`var') \/ sd_`var'\n  6. \n. }\n\n    Variable |        Obs        Mean    Std. Dev.       Min        Max\n-------------+---------------------------------------------------------\n     m_peeps |      8,410   -.0188083     .706219  -1.693648   2.087268\n\n    Variable |        Obs        Mean    Std. Dev.       
Min        Max\n-------------+---------------------------------------------------------\n  monitoring |      8,410    .0671427    1.008082  -2.976081   2.434706\n\n    Variable |        Obs        Mean    Std. Dev.       Min        Max\n-------------+---------------------------------------------------------\n  operations |      8,410     .138289    1.011542   -2.05452   2.352676\n\n    Variable |        Obs        Mean    Std. Dev.       Min        Max\n-------------+---------------------------------------------------------\n     targets |      8,410    .1176681    1.012729  -2.619704   2.805768\n\n. egen management_sum_avg2 = rowmean(m_peeps_z2 monitoring_z2 operations_z2 targets_z2)\n\n. summarize management_sum_avg2\n\n    Variable |        Obs        Mean    Std. Dev.       Min        Max\n-------------+---------------------------------------------------------\nmanagement~2 |      8,410    3.35e-10    .8171414  -2.406625   2.252212\n\n. scalar mean_m2 = r(mean)\n\n. scalar sd_m2   = r(sd)\n\n. gen management_sum_z2 = (management_sum_avg2 - mean_m2) \/ sd_m2\n\n. summarize management_sum_z2 management\n\n    Variable |        Obs        Mean    Std. Dev.       Min        Max\n-------------+---------------------------------------------------------\nmanagemen~z2 |      8,410   -1.81e-11           1  -2.945175   2.756208\n  management |      8,410    .0921417    1.012059  -3.019884   2.841167\n\n. correlate management_sum_z2 management\n(obs=8,410)\n\n             | manag~z2 manage~t\n-------------+------------------\nmanagemen~z2 |   1.0000\n  management |   0.9872   1.0000\n\n. scatter management_sum_z2 management\n\n. 
correlate management_sum_avg2 management\n(obs=8,410)\n\n             | manag~g2 manage~t\n-------------+------------------\nmanagemen~g2 |   1.0000\n  management |   0.9872   1.0000\n<\/code><\/pre>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">14. Perform a regression analysis of <code>log(sales)<\/code> on <code>log(employees in the firm)<\/code>.<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">regress ly lemp<\/code><\/pre>\n\n\n\n<p>The output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. regress ly lemp\n\n      Source |       SS           df       MS      Number of obs   =     8,417\n-------------+----------------------------------   F(1, 8415)      =     26.46\n       Model |  16.4187464         1  16.4187464   Prob > F        =    0.0000\n    Residual |  5221.37628     8,415  .620484407   R-squared       =    0.0031\n-------------+----------------------------------   Adj R-squared   =    0.0030\n       Total |  5237.79503     8,416  .622361577   Root MSE        =    .78771\n\n------------------------------------------------------------------------------\n          ly |      Coef.   Std. Err.      t    P>|t|     [95% Conf. 
Interval]\n-------------+----------------------------------------------------------------\n        lemp |   .0495207   .0096268     5.14   0.000     .0306498    .0683916\n       _cons |   4.822661   .0545029    88.48   0.000     4.715822      4.9295\n------------------------------------------------------------------------------<\/code><\/pre>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">15. Predict fitted values and create a chart plotting a scatterplot of <code>log(sales)<\/code> against <code>log(employees in the firm)<\/code> with a fitted line.<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">predict fitted_ly\ntwoway (scatter ly lemp) (line fitted_ly lemp), title(\"Log(Sales) vs Log(Employees) with Fit Line\")  legend(order(1 \"Actual\" 2 \"Fitted\")) xlabel(, angle(vertical)) ylabel(, angle(horizontal)) scheme(s1color)\ngraph export \"LogSales_vs_LogEmployees_with_FitLine.png\", width(800) replace<\/code><\/pre>\n\n\n\n<p>The output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. predict fitted_ly\n(option xb assumed; fitted values)\n\n. twoway (scatter ly lemp) (line fitted_ly lemp), title(\"Log(Sales) vs Log(Employees) with Fit Line\")  legend(order(1 \"Actual\" 2 \"Fitted\")) xlabel(, angl\n> e(vertical)) ylabel(, angle(horizontal)) scheme(s1color)\n\n. 
graph export \"LogSales_vs_LogEmployees_with_FitLine.png\", width(800) replace\n(file LogSales_vs_LogEmployees_with_FitLine.png written in PNG format)\n<\/code><\/pre>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"2376\" height=\"1728\" src=\"https:\/\/www.yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab3-2-3.png\" alt=\"\" class=\"wp-image-1941\" srcset=\"https:\/\/yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab3-2-3.png 2376w, https:\/\/yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab3-2-3-300x218.png 300w, https:\/\/yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab3-2-3-1024x745.png 1024w, https:\/\/yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab3-2-3-768x559.png 768w, https:\/\/yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab3-2-3-1536x1117.png 1536w, https:\/\/yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab3-2-3-2048x1489.png 2048w\" sizes=\"(max-width: 2376px) 100vw, 2376px\" \/><\/figure>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">16. 
Repeat the same regression, but this time limit the sample to:<br>i) UK domestic companies,<br>ii) US multinational companies,<br>iii) other multinational companies.<br>Plot three separate fitted lines on the scatterplot.<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">regress ly lemp if company_type == 3\npredict fitted_uk if company_type == 3\nregress ly lemp if company_type == 1\npredict fitted_us if company_type == 1\nregress ly lemp if company_type == 2\npredict fitted_oth if company_type == 2\ntwoway (scatter ly lemp if company_type == 3, mcolor(eltblue) msymbol(Oh) msize(small) legend(label(1 \"UK Domestic (scatter)\")))(line fitted_uk lemp if company_type == 3, lcolor(blue) lwidth(medium) legend(label(2 \"UK Domestic (line)\")))(scatter ly lemp if company_type == 1, mcolor(pink) msymbol(Oh) msize(small) legend(label(3 \"US Multinational (scatter)\")))(line fitted_us lemp if company_type == 1, lcolor(red) lwidth(medium) legend(label(4 \"US Multinational (line)\")))(scatter ly lemp if company_type == 2, mcolor(olive_teal) msymbol(Oh) msize(small) legend(label(5 \"Other Multinational (scatter)\")))(line fitted_oth lemp if company_type == 2, lcolor(lime) lwidth(medium) legend(label(6 \"Other Multinational (line)\"))),title(\"Log(sales) vs. Log(employees) by Company Type\") xtitle(\"Log(employees in the firm)\") ytitle(\"Log(sales)\") legend(order(1 2 3 4 5 6) region(style(none)) position(6) col(2) size(small))<\/code><\/pre>\n\n\n\n<p>The output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. 
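* the three subsample regressions below could also be combined into a single\n. * interacted model, e.g. (a sketch, not run here): regress ly c.lemp##i.company_type\n\n. 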
regress ly lemp if company_type == 3\n\n      Source |       SS           df       MS      Number of obs   =     5,326\n-------------+----------------------------------   F(1, 5324)      =     13.36\n       Model |  8.28050094         1  8.28050094   Prob > F        =    0.0003\n    Residual |  3299.94842     5,324  .619825023   R-squared       =    0.0025\n-------------+----------------------------------   Adj R-squared   =    0.0023\n       Total |  3308.22892     5,325  .621263648   Root MSE        =    .78729\n\n------------------------------------------------------------------------------\n          ly |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]\n-------------+----------------------------------------------------------------\n        lemp |   .0436063   .0119304     3.66   0.000     .0202178    .0669948\n       _cons |   4.742818   .0669286    70.86   0.000      4.61161    4.874025\n------------------------------------------------------------------------------\n\n. \n. predict fitted_uk if company_type == 3\n(option xb assumed; fitted values)\n(3,091 missing values generated)\n\n. \n. regress ly lemp if company_type == 1\n\n      Source |       SS           df       MS      Number of obs   =       919\n-------------+----------------------------------   F(1, 917)       =     16.89\n       Model |  8.24884946         1  8.24884946   Prob > F        =    0.0000\n    Residual |  447.783833       917  .488313885   R-squared       =    0.0181\n-------------+----------------------------------   Adj R-squared   =    0.0170\n       Total |  456.032682       918  .496767628   Root MSE        =    .69879\n\n------------------------------------------------------------------------------\n          ly |      Coef.   Std. Err.      t    P>|t|     [95% Conf. 
Interval]\n-------------+----------------------------------------------------------------\n        lemp |   -.105811   .0257444    -4.11   0.000    -.1563359   -.0552861\n       _cons |   5.963486   .1497717    39.82   0.000     5.669551    6.257421\n------------------------------------------------------------------------------\n\n. \n. predict fitted_us if company_type == 1\n(option xb assumed; fitted values)\n(7,498 missing values generated)\n\n. \n. regress ly lemp if company_type == 2\n\n      Source |       SS           df       MS      Number of obs   =     2,172\n-------------+----------------------------------   F(1, 2170)      =     16.94\n       Model |  9.88791441         1  9.88791441   Prob > F        =    0.0000\n    Residual |  1266.64783     2,170  .583708676   R-squared       =    0.0077\n-------------+----------------------------------   Adj R-squared   =    0.0073\n       Total |  1276.53574     2,171  .587994353   Root MSE        =    .76401\n\n------------------------------------------------------------------------------\n          ly |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]\n-------------+----------------------------------------------------------------\n        lemp |   .0797256   .0193706     4.12   0.000     .0417387    .1177124\n       _cons |   4.822956   .1108085    43.53   0.000     4.605654    5.040258\n------------------------------------------------------------------------------\n\n. \n. predict fitted_oth if company_type == 2\n(option xb assumed; fitted values)\n(6,245 missing values generated)\n\n. \n. 
twoway (scatter ly lemp if company_type == 3, mcolor(eltblue) msymbol(Oh) msize(small) legend(label(1 \"UK Domestic (scatter)\"\n> )))(line fitted_uk lemp if company_type == 3, lcolor(blue) lwidth(medium) legend(label(2 \"UK Domestic (line)\")))(scatter ly l\n> emp if company_type == 1, mcolor(pink) msymbol(Oh) msize(small) legend(label(3 \"US Multinational (scatter)\")))(line fitted_us\n>  lemp if company_type == 1, lcolor(red) lwidth(medium) legend(label(4 \"US Multinational (line)\")))(scatter ly lemp if company\n> _type == 2, mcolor(olive_teal) msymbol(Oh) msize(small) legend(label(5 \"Other Multinational (scatter)\")))(line fitted_oth lem\n> p if company_type == 2, lcolor(lime) lwidth(medium) legend(label(6 \"Other Multinational (line)\"))),title(\"Log(sales) vs. Log(\n> employees) by Company Type\") xtitle(\"Log(employees in the firm)\") ytitle(\"Log(sales)\") legend(order(1 2 3 4 5 6) region(style\n> (none)) position(6) col(2) size(small))\n\n<\/code><\/pre>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"2376\" height=\"1728\" src=\"https:\/\/www.yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab3-2-4-1.png\" alt=\"\" class=\"wp-image-1951\" srcset=\"https:\/\/yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab3-2-4-1.png 2376w, https:\/\/yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab3-2-4-1-300x218.png 300w, https:\/\/yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab3-2-4-1-1024x745.png 1024w, https:\/\/yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab3-2-4-1-768x559.png 768w, https:\/\/yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab3-2-4-1-1536x1117.png 1536w, https:\/\/yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab3-2-4-1-2048x1489.png 2048w\" sizes=\"(max-width: 2376px) 100vw, 2376px\" \/><\/figure>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" 
data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">17. Rank companies based on year, country, and management score.<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">bysort year cty (management): gen rank = _N - _n +1\nlist company_code cty year management rank in 1\/10<\/code><\/pre>\n\n\n\n<p>The output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. bysort year cty (management): gen rank = _N - _n +1\n\n. list company_code cty year management rank in 1\/10\n\n     +------------------------------------------+\n     | compa~de   cty   year   managem~t   rank |\n     |------------------------------------------|\n  1. |      207    fr   1999     -2.2934    189 |\n  2. |      237    fr   1999   -2.289897    188 |\n  3. |      241    fr   1999   -2.118449    187 |\n  4. |      140    fr   1999   -2.019981    186 |\n  5. |      313    fr   1999   -1.942747    185 |\n     |------------------------------------------|\n  6. |      398    fr   1999   -1.869219    184 |\n  7. |      158    fr   1999   -1.860743    183 |\n  8. |      389    fr   1999   -1.835789    182 |\n  9. |      402    fr   1999   -1.780855    181 |\n 10. |      338    fr   1999   -1.761644    180 |\n     +------------------------------------------+<\/code><\/pre>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">17+. 
Generate a variable\u00a0<code>nobs<\/code>\u00a0to represent the number of observations for each company.<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">bysort company_code: egen nobs = count(company_code)<\/code><\/pre>\n\n\n\n<p>The output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. bysort company_code: egen nobs = count(company_code)\n<\/code><\/pre>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">17++. Create a scatter plot of management scores and sales using only 10% of the observations for each country and year (randomly selected observations).<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">sort cty year\nset seed 12345\nby cty year: gen double rnd = runiform()\ngen byte pick = (rnd < 0.1)\ntwoway (scatter management ly if pick == 1)\n<\/code><\/pre>\n\n\n\n<p>The output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. sort cty year\n\n. set seed 12345\n\n. by cty year: gen double rnd = runiform()\n\n. gen byte pick = (rnd < 0.1)\n\n. 
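* note: the runiform() cutoff keeps only approximately 10% of each cty-year cell;\n. * for an exact within-group draw, a sketch (not run here) is to preserve, run\n. * \"sample 10, by(cty year)\", make the plot, and then restore\n\n. 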
twoway (scatter management ly if pick == 1)\n<\/code><\/pre>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"2343\" height=\"1704\" src=\"https:\/\/www.yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab3-2-5.png\" alt=\"\" class=\"wp-image-1969\" srcset=\"https:\/\/yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab3-2-5.png 2343w, https:\/\/yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab3-2-5-300x218.png 300w, https:\/\/yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab3-2-5-1024x745.png 1024w, https:\/\/yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab3-2-5-768x559.png 768w, https:\/\/yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab3-2-5-1536x1117.png 1536w, https:\/\/yanagichiaki.jp\/wp-content\/uploads\/2024\/12\/Introductory-Eco-Lab3-2-5-2048x1489.png 2048w\" sizes=\"(max-width: 2343px) 100vw, 2343px\" \/><\/figure>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">18. Regress <code>log(sales)<\/code> on <code>log(materials)<\/code>, <code>log(employment)<\/code>, and <code>log(capital)<\/code>.<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">regress ly lmat lemp lcap<\/code><\/pre>\n\n\n\n<p>The output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. 
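* in this log-log specification the slope coefficients are elasticities; a\n. * heteroskedasticity-robust variant (a sketch, not run here) is:\n. * regress ly lmat lemp lcap, vce(robust)\n\n. 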
regress ly lmat lemp lcap\n\n      Source |       SS           df       MS      Number of obs   =     4,227\n-------------+----------------------------------   F(3, 4223)      =   4983.44\n       Model |  2315.18686         3  771.728954   Prob > F        =    0.0000\n    Residual |  653.968595     4,223  .154858772   R-squared       =    0.7797\n-------------+----------------------------------   Adj R-squared   =    0.7796\n       Total |  2969.15546     4,226  .702592394   Root MSE        =    .39352\n\n------------------------------------------------------------------------------\n          ly |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]\n-------------+----------------------------------------------------------------\n        lmat |   .6332863   .0062647   101.09   0.000     .6210043    .6455683\n        lemp |   .0013348   .0067869     0.20   0.844     -.011971    .0146407\n        lcap |   .1230179   .0063333    19.42   0.000     .1106013    .1354345\n       _cons |   2.025731   .0445252    45.50   0.000     1.938438    2.113024\n------------------------------------------------------------------------------<\/code><\/pre>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">19. Predict residuals and replace them with their squared values.<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">predict residuals, residuals\ngen residuals_sq = residuals^2<\/code><\/pre>\n\n\n\n<p>The output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. 
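* predict uses the most recently estimated model, so this step must follow the\n. * regression in question 18; the squared residuals are a heteroskedasticity\n. * diagnostic, which could also be run directly (a sketch, not run here) as:\n. * estat hettest\n\n. 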
predict residuals, residuals\n(4,190 missing values generated)\n\n. gen residuals_sq = residuals^2\n(4,190 missing values generated)\n<\/code><\/pre>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">20. Perform a regression of <code>log(sales)<\/code> on <code>log(materials)<\/code>, <code>log(employment)<\/code>, <code>log(capital)<\/code>, and <code>management<\/code> for each country in the sample (use a loop).<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">levelsof cty, local(countries) \nforeach country of local countries {\n    display \"Running regression for country: `country'\"\n    count if cty == \"`country'\" & !missing(ly, lmat, lemp, lcap, management)\n    if r(N) > 0 regress ly lmat lemp lcap management if cty == \"`country'\"\n    else display \"Skipping country: `country' (insufficient non-missing observations)\"\n}\n<\/code><\/pre>\n\n\n\n<p>The output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. levelsof cty, local(countries) \n`\"fr\"' `\"ge\"' `\"it\"' `\"po\"' `\"pt\"' `\"sw\"' `\"uk\"'\n\n. foreach country of local countries {\n  2. \n.     display \"Running regression for country: `country'\"\n  3. \n.     count if cty == \"`country'\" & !missing(ly, lmat, lemp, lcap, management)\n  4. \n.     if r(N) > 0 regress ly lmat lemp lcap management if cty == \"`country'\"\n  5. \n.     else display \"Skipping country: `country' (insufficient non-missing observations)\"\n  6. \n. 
}\nRunning regression for country: fr\n  1,426\n\n      Source |       SS           df       MS      Number of obs   =     1,426\n-------------+----------------------------------   F(4, 1421)      =    756.57\n       Model |  405.686127         4  101.421532   Prob > F        =    0.0000\n    Residual |   190.49255     1,421  .134055278   R-squared       =    0.6805\n-------------+----------------------------------   Adj R-squared   =    0.6796\n       Total |  596.178678     1,425  .418371002   Root MSE        =    .36614\n\n------------------------------------------------------------------------------\n          ly |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]\n-------------+----------------------------------------------------------------\n        lmat |   .5123901    .011429    44.83   0.000     .4899707    .5348096\n        lemp |   .0229771   .0131041     1.75   0.080    -.0027284    .0486826\n        lcap |   .1112761   .0109944    10.12   0.000      .089709    .1328432\n  management |   .0304229   .0109067     2.79   0.005      .009028    .0518178\n       _cons |   2.634434   .0884672    29.78   0.000     2.460894    2.807975\n------------------------------------------------------------------------------\nRunning regression for country: ge\n  375\n\n      Source |       SS           df       MS      Number of obs   =       375\n-------------+----------------------------------   F(4, 370)       =    394.90\n       Model |  96.6379666         4  24.1594916   Prob > F        =    0.0000\n    Residual |  22.6362051       370  .061178933   R-squared       =    0.8102\n-------------+----------------------------------   Adj R-squared   =    0.8082\n       Total |  119.274172       374  .318914898   Root MSE        =    .24734\n\n------------------------------------------------------------------------------\n          ly |      Coef.   Std. Err.      t    P>|t|     [95% Conf. 
Interval]\n-------------+----------------------------------------------------------------\n        lmat |   .5343757   .0154031    34.69   0.000     .5040871    .5646644\n        lemp |  -.0816476   .0137308    -5.95   0.000    -.1086478   -.0546474\n        lcap |   .1213607   .0150875     8.04   0.000     .0916927    .1510286\n  management |   .0246979    .015515     1.59   0.112    -.0058106    .0552065\n       _cons |   3.044236   .1334112    22.82   0.000     2.781897    3.306575\n------------------------------------------------------------------------------\nRunning regression for country: it\n  905\n\n      Source |       SS           df       MS      Number of obs   =       905\n-------------+----------------------------------   F(4, 900)       =   1025.79\n       Model |  268.180953         4  67.0452382   Prob > F        =    0.0000\n    Residual |  58.8238441       900  .065359827   R-squared       =    0.8201\n-------------+----------------------------------   Adj R-squared   =    0.8193\n       Total |  327.004797       904   .36173097   Root MSE        =    .25566\n\n------------------------------------------------------------------------------\n          ly |      Coef.   Std. Err.      t    P>|t|     [95% Conf. 
Interval]\n-------------+----------------------------------------------------------------\n        lmat |   .5709453   .0104092    54.85   0.000     .5505162    .5913744\n        lemp |    -.07023   .0105837    -6.64   0.000    -.0910016   -.0494584\n        lcap |   .0976345   .0096953    10.07   0.000     .0786065    .1166625\n  management |   .0267954   .0082875     3.23   0.001     .0105304    .0430603\n       _cons |   2.802493   .0753308    37.20   0.000     2.654648    2.950337\n------------------------------------------------------------------------------\nRunning regression for country: po\n  562\n\n      Source |       SS           df       MS      Number of obs   =       562\n-------------+----------------------------------   F(4, 557)       =    423.87\n       Model |  374.045352         4  93.5113379   Prob > F        =    0.0000\n    Residual |  122.882657       557  .220615183   R-squared       =    0.7527\n-------------+----------------------------------   Adj R-squared   =    0.7509\n       Total |  496.928009       561  .885789676   Root MSE        =     .4697\n\n------------------------------------------------------------------------------\n          ly |      Coef.   Std. Err.      t    P>|t|     [95% Conf. 
Interval]\n-------------+----------------------------------------------------------------\n        lmat |   .4788854   .0185862    25.77   0.000     .4423779     .515393\n        lemp |  -.1240909   .0253473    -4.90   0.000    -.1738788    -.074303\n        lcap |   .2397321   .0206813    11.59   0.000     .1991092    .2803549\n  management |   .0801802    .021518     3.73   0.000     .0379138    .1224466\n       _cons |   2.534757   .1582044    16.02   0.000     2.224007    2.845507\n------------------------------------------------------------------------------\nRunning regression for country: pt\n  463\n\n      Source |       SS           df       MS      Number of obs   =       463\n-------------+----------------------------------   F(4, 458)       =    468.44\n       Model |  218.307087         4  54.5767718   Prob > F        =    0.0000\n    Residual |  53.3608483       458  .116508402   R-squared       =    0.8036\n-------------+----------------------------------   Adj R-squared   =    0.8019\n       Total |  271.667936       462  .588025835   Root MSE        =    .34133\n\n------------------------------------------------------------------------------\n          ly |      Coef.   Std. Err.      t    P>|t|     [95% Conf. 
Interval]\n-------------+----------------------------------------------------------------\n        lmat |   .6016575   .0177263    33.94   0.000     .5668226    .6364924\n        lemp |  -.0094647   .0200062    -0.47   0.636      -.04878    .0298507\n        lcap |   .1247086   .0198622     6.28   0.000     .0856763    .1637409\n  management |   .0672758   .0167721     4.01   0.000     .0343161    .1002356\n       _cons |   2.057761   .1312337    15.68   0.000     1.799866    2.315656\n------------------------------------------------------------------------------\nRunning regression for country: sw\n  496\n\n      Source |       SS           df       MS      Number of obs   =       496\n-------------+----------------------------------   F(4, 491)       =    603.13\n       Model |   148.60951         4  37.1523775   Prob > F        =    0.0000\n    Residual |  30.2451752       491  .061599135   R-squared       =    0.8309\n-------------+----------------------------------   Adj R-squared   =    0.8295\n       Total |  178.854685       495  .361322596   Root MSE        =    .24819\n\n------------------------------------------------------------------------------\n          ly |      Coef.   Std. Err.      t    P>|t|     [95% Conf. 
Interval]\n-------------+----------------------------------------------------------------\n        lmat |   .6121036   .0158239    38.68   0.000     .5810127    .6431945\n        lemp |   .0455529   .0138143     3.30   0.001     .0184105    .0726953\n        lcap |    .111258   .0121415     9.16   0.000     .0874024    .1351137\n  management |  -.0211386   .0127182    -1.66   0.097    -.0461275    .0038502\n       _cons |   1.941344   .0867648    22.37   0.000     1.770868     2.11182\n------------------------------------------------------------------------------\nRunning regression for country: uk\n  0\nSkipping country: uk (insufficient non-missing observations)\n\n<\/code><\/pre>\n<\/div><\/details>\n\n\n\n<details class=\"swell-block-accordion__item\" data-swl-acc=\"wrapper\"><summary class=\"swell-block-accordion__title\" data-swl-acc=\"header\"><span class=\"swell-block-accordion__label\">20+. Test whether the coefficient of\u00a0<code>management<\/code>\u00a0is statistically significant; if it is, test whether it equals 0.03.<\/span><span class=\"swell-block-accordion__icon c-switchIconBtn\" data-swl-acc=\"icon\" aria-hidden=\"true\" data-opened=\"false\"><i class=\"__icon--closed icon-caret-down\"><\/i><i class=\"__icon--opened icon-caret-up\"><\/i><\/span><\/summary><div class=\"swell-block-accordion__body\" data-swl-acc=\"body\">\n<pre class=\"wp-block-code\"><code class=\"stata\">reg ly management\ntest _b[management] = 0.03\n<\/code><\/pre>\n\n\n\n<p>The output:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">. 
reg ly management\n\n      Source |       SS           df       MS      Number of obs   =     8,417\n-------------+----------------------------------   F(1, 8415)      =    341.64\n       Model |  204.355239         1  204.355239   Prob > F        =    0.0000\n    Residual |  5033.43979     8,415  .598150896   R-squared       =    0.0390\n-------------+----------------------------------   Adj R-squared   =    0.0389\n       Total |  5237.79503     8,416  .622361577   Root MSE        =     .7734\n\n------------------------------------------------------------------------------\n          ly |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]\n-------------+----------------------------------------------------------------\n  management |   .1539264   .0083277    18.48   0.000     .1376021    .1702508\n       _cons |    5.08551    .008464   600.84   0.000     5.068919    5.102102\n------------------------------------------------------------------------------\n\n. test _b[management] = 0.03\n\n ( 1)  management = .03\n\n       F(  1,  8415) =  221.45\n            Prob > F =    0.0000\n<\/code><\/pre>\n\n\n\n<p>The coefficient of\u00a0<code>management<\/code>\u00a0is\u00a0<strong>0.1539<\/strong>, with a standard error of\u00a0<strong>0.0083<\/strong>. The t-value for\u00a0<code>management<\/code>\u00a0is\u00a0<strong>18.48<\/strong>, and the p-value is\u00a0<strong>0.000<\/strong>, indicating that the coefficient is highly statistically significant.<\/p>\n\n\n\n<p>The 95% confidence interval for the\u00a0<code>management<\/code>\u00a0coefficient is\u00a0<strong>[0.1376, 0.1703]<\/strong>, which does not include 0.03. Here, the null hypothesis\u00a0$H_0:\\beta_{management}=0.03$\u00a0was tested. The F-statistic for this test is\u00a0<strong>221.45<\/strong>, with a p-value of\u00a0<strong>0.000<\/strong>. 
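<\/p>\n\n\n\n<p>As a quick sanity check: for a single linear restriction, the F-statistic is simply the square of the corresponding t-statistic, so the reported F can be recovered from the coefficient and standard error above. A minimal sketch using the estimates that <code>regress<\/code> leaves in memory (run immediately after <code>reg ly management<\/code>):<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code class=\"stata\">* squared t-statistic for H0: _b[management] = 0.03\n* ((0.1539264 - 0.03) \/ 0.0083277)^2 is about 221.45, matching the test output\ndisplay ((_b[management] - 0.03) \/ _se[management])^2<\/code><\/pre>\n\n\n\n<p>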
Since the p-value is extremely small, we reject the null hypothesis.<\/p>\n\n\n\n<p>The coefficient of\u00a0<code>management<\/code>\u00a0is <span class=\"swl-marker mark_yellow\">statistically significant<\/span>, and it is\u00a0<span class=\"swl-marker mark_yellow\">not equal to 0.03<\/span>. The observed coefficient of 0.1539 is substantially larger than 0.03, as supported by the hypothesis test and confidence interval.<\/p>\n<\/div><\/details>\n<\/div>\n\n\n\n<p class=\"has-text-align-center has-border -border03 is-style-kakko_box\">That's all.<\/p>\n\n\n\n\n","protected":false,"excerpt":{"rendered":"<p>\u30cf\u30eb\u30d3\u30f3\u5de5\u696d\u5927\u5b66\uff08\u6df1\u5733\uff09\u2022 2024 \u2022 \u5165\u9580\u8a08\u91cf\u7d4c\u6e08\u5b66 Homework &#038; Lab \u2022 HITSZ \u54c8\u5c14\u6ee8\u5de5\u4e1a\u5927\u5b66\uff08\u6df1\u5733\uff09 \u57fa\u7840\u8ba1\u91cf\u7ecf\u6d4e\u5b66\u4f5c\u4e1a \u2022 \u5b9e\u9a8c 2024<\/p>\n","protected":false,"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"swell_btn_cv_data":"","_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[8],"tags":[],"class_list":["post-1855","post","type-post","status-publish","format-standard","hentry","category-learninginhitsz"],"jetpack_sharing_enabled":true,"jetpack_featured_media_url":"","_links":{"self":[{"href":"https:\/\/yanagichiaki.jp\/index.php\/wp-json\/wp\/v2\/posts\/1855","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/yanagichiaki.jp\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/yanagichiaki.jp\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/yanagichiaki.jp\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/yanagichiaki.jp\/index.php\/wp-json\/wp\/v2\/comments?post=1855"}],"version-history":[{"count":89,"href":"https:\/\/yanagichiaki.jp\/index.php\/w
p-json\/wp\/v2\/posts\/1855\/revisions"}],"predecessor-version":[{"id":1973,"href":"https:\/\/yanagichiaki.jp\/index.php\/wp-json\/wp\/v2\/posts\/1855\/revisions\/1973"}],"wp:attachment":[{"href":"https:\/\/yanagichiaki.jp\/index.php\/wp-json\/wp\/v2\/media?parent=1855"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/yanagichiaki.jp\/index.php\/wp-json\/wp\/v2\/categories?post=1855"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/yanagichiaki.jp\/index.php\/wp-json\/wp\/v2\/tags?post=1855"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}