ENH: Add support for reading 102-format Stata dta files #58978

Draft
wants to merge 1 commit into
base: main

cmjcharlton
Contributor

  • closes #xxxx (Replace xxxx with the GitHub issue number)
  • Tests added and passed if fixing a bug or adding a new feature
  • All code checks passed.
  • Added type annotations to new arguments/methods/functions.
  • Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

This would complete support for reading all historic Stata dta format versions.

I would understand if you chose not to merge this, because:

  • No formal documentation exists for this version, so I have had to infer the details from later formats and the Stata 1 user manual.
  • Unlike every other format version, I have not been able to locate any sample data written in this version (hence I haven't created a linked issue).

Having said that, I am reasonably confident that the changes are correct, and Stata is happy to open and view the test data that I created:

. dtaversion "stata-compat-102.dta"
  (file "stata-compat-102.dta" is
   .dta-format 102 from Stata 1)
. use "stata-compat-102.dta"
. describe

Contains data from stata-compat-102.dta
 Observations:             3                  
    Variables:             7                  
-------------------------------------------------------------------------------------------------------------------
Variable      Storage   Display    Value
    name         type    format    label      Variable label
-------------------------------------------------------------------------------------------------------------------
index           long    %12.0g                
i8              int     %8.0g                 
i16             int     %8.0g                 
i32             long    %12.0g                
f               float   %9.0g                 
d               double  %10.0g                
dt              double  %10.0g                
-------------------------------------------------------------------------------------------------------------------
Sorted by:

. list

     +--------------------------------------------------+
     | index   i8     i16        i32     f    d      dt |
     |--------------------------------------------------|
  1. |     1   -1   -1025   -8388609   -.1   .1   14610 |
  2. |     2    0       0          0   -.2   .2   14611 |
  3. |     3    1    1025    8388609   -.3   .3   14612 |
     +--------------------------------------------------+
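
As a quick sanity check on the `dt` column above: the name suggests dates, and if those values are Stata daily dates (days since 1 January 1960, which is an assumption here since the listing shows them as plain doubles), they can be decoded with the standard library alone. The helper name `stata_days_to_date` is hypothetical and is not part of pandas or of this change.

```python
from datetime import date, timedelta

# Stata counts daily dates from 1 January 1960.
STATA_EPOCH = date(1960, 1, 1)


def stata_days_to_date(days: int) -> date:
    """Convert a Stata daily-date value to a calendar date.

    Hypothetical helper for illustration only; it assumes the value is a
    whole number of days since the Stata epoch.
    """
    return STATA_EPOCH + timedelta(days=days)


# The three dt values from the listing above.
for value in (14610, 14611, 14612):
    print(value, stata_days_to_date(value).isoformat())
# 14610 2000-01-01
# 14611 2000-01-02
# 14612 2000-01-03
```

Under that assumption the three rows correspond to the first three days of January 2000.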
. dtaversion "stata4_102.dta"
  (file "stata4_102.dta" is
   .dta-format 102 from Stata 1)
. use "stata4_102.dta"
. describe

Contains data from stata4_102.dta
 Observations:            10                  
    Variables:             5                  
-------------------------------------------------------------------------------------------------------------------
Variable      Storage   Display    Value
    name         type    format    label      Variable label
-------------------------------------------------------------------------------------------------------------------
fulllab         int     %8.0g      full_lbl   A fully labeled variable.
fulllab2        float   %9.0g      full_lbl   Another fully labeled variable.
incmplab        long    %12.0g     incp_lbl   Some values without labels.
misslab         int     %8.0g      miss_lbl   Some missing value labels.
floatlab        float   %9.0g      full_lbl   Floating point with labels.
-------------------------------------------------------------------------------------------------------------------
Sorted by:

. list

     +----------------------------------------------------+
     | fulllab   fulllab2   incmplab   misslab   floatlab |
     |----------------------------------------------------|
  1. |     one        ten        one       one        one |
  2. |     two       nine        two       two        two |
  3. |   three      eight      three     three      three |
  4. |    four      seven          4      four       four |
  5. |    five        six          5         .       five |
     |----------------------------------------------------|
  6. |     six       five          6         .        six |
  7. |   seven       four          7         .      seven |
  8. |   eight      three          8         .      eight |
  9. |    nine        two          9         .       nine |
 10. |     ten        one        ten         .        ten |
     +----------------------------------------------------+
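
For anyone without Stata to hand, a rough Python equivalent of the `dtaversion` checks above might look like the sketch below. It assumes that format 102 follows the later pre-117 formats in storing the format version as the first byte of the file, and that 117+ files start with an XML-like `<stata_dta>` header containing a `<release>` tag. The function name `dta_format_version` is hypothetical; this is not the code in this pull request.

```python
import re


def dta_format_version(head: bytes) -> int:
    """Guess the dta format version from the leading bytes of a file.

    Illustrative sketch only: assumes old-style files carry the version
    number in their first byte, and that new-style files open with
    "<stata_dta>" followed by a <release>NNN</release> tag.
    """
    if not head:
        raise ValueError("no data to inspect")
    if head.startswith(b"<stata_dta>"):
        match = re.search(rb"<release>(\d+)</release>", head)
        if match is None:
            raise ValueError("new-style header without a <release> tag")
        return int(match.group(1))
    # Old-style header: the first byte is the version number itself.
    return head[0]


# Usage with in-memory bytes (the trailing bytes are arbitrary filler).
print(dta_format_version(bytes([102, 0, 0, 0])))
print(dta_format_version(b"<stata_dta><header><release>117</release>"))
```

To check a real file, read its first few dozen bytes in binary mode and pass them in, for example `dta_format_version(open(path, "rb").read(64))`.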
