- meta_data = """1 1 X CH 13 Species code
- 2-3 2 AA CH 4 Breed of evaluation
- Bull's identification information
- 4-5 2 AA CH 4 Breed code (alpha code only, no zeros)
- 6-8 3 AAA CH 119 Country code of ID origin
- 9-20 12 XX...XX CH Identification number (registration or eartag)
- Sire's identification information
- 21-22 2 AA CH 4 Breed code (registration or eartag)
- 23-25 3 AAA CH 119 Country code of ID origin
- 26-37 12 XX...XX CH Identification number (registration or eartag)
- Dam's identification information
- 38-39 2 AA CH 4 Breed code (alpha code only, no zeros)
- 40-42 3 AAA CH 119 Country code of ID origin
- 43-54 12 XX...XX CH Identification number (registration or eartag)
- Maternal grandsire's identification information
- 55-56 2 AA CH 4 Breed code (registration or eartag)
- 57-59 3 AAA CH 119 Country code of ID origin
- 60-71 12 XX...XX CH Identification number (registration or eartag)
- Bull's dual registration identification information
- 72-73 2 AA CH 4 95 Breed code (alpha code only, no zeros)
- 74-76 3 AAA CH 119 95 Country code of ID origin
- 77-88 12 XX...XX CH 95 Identification number (registration or eartag)
- 89-96 8 XX...XX CH 88 Birth date (YYYYMMDD)
- 97-98 2 XX CH 130 145 Registry status code
- 99-128 30 AA...AA CH Registered name
- 129-148 20 AA...AA CH 95 145 Short AI name
- 149-154 6 XXXXXX CH 83 95 145 Date the bull entered AI (YYYYMM)
- 155 1 A CH 87 95 145 Sampling status
- 156-159 4 XXXX CH 87 95 145 Sampling controller number
- 160 1 A CH 35 94 95 145 Current status code
- 161-164 4 XXXX CH 95 145 NAAB bull controller number
- 165 1 X CH 95 145 Number of uniform NAAB sire codes assigned in following positions as well as columns 548-567
- 166-175 10 XXXAAXXXXX CH 95 145 Primary NAAB code
- 176-205 30 XXXAAXXXXX CH 95 145 Secondary NAAB codes (up to 3 additional codes)
- Herd with most daughters
- 206-207 2 XX CH 5 145 State code
- 208-209 2 XX CH 145 County code
- 210-213 4 XXXX CH 145 Herd number
- 214-217 4 XXXX CH 33 95 145 Number of daughters in herd with most daughters
- 218-219 2 XX CH 5 33 145 State with most daughters
- 220-221 2 XX CH 33 95 145 Age at first calving (months)
- 222-224 3 XXX CH 145 Percent daughters with 1st lactation records from management plans (Type of test ≥ 40)
- 225-227 3 XX.X CH 145 Inbreeding coefficient of this bull (%)
- 228-230 3 XX.X CH 145 Average daughter inbreeding percent
- 231-233 3 XX.X CH 145 Expected future inbreeding (%) (EFI)
- 234-235 2 XX CH Reliability of yield (avg. of protein and milk-fat (MF) reliabilities, weighted by current component prices)
- 236-237 2 XX CH 33 95 Reliability of PTA daughter pregnancy rate (DPR)
- 238-242 5 +/-XXXX CSL 33 PTA milk
- 243-244 2 XX CH 33 Reliability of PTA MF
- 245-248 4 +/-XXX CSL 33 95 PTA fat
- 249-251 3 +/-.XX CSL 33 PTA fat percentage
- 252-253 2 XX CH 33 Reliability of PTA protein
- 254-257 4 +/-XXX CSL 33 95 PTA protein
- 258-260 3 +/-.XX CSL 33 PTA protein percent
- 261-262 2 XX CH 95 Reliability of PTA productive life (PL)
- 263-265 3 +/-X.X CSL 95 PTA PL
- 266-267 2 XX CH 95 Reliability of PTA somatic cell score (SCS)
- 268-270 3 X.XX CH 95 PTA SCS
- 271-272 2 XX CH Reliability of net merit dollars (NM$)
- 273-277 5 +/-XXXX CSL 33 Fluid merit dollars (FM$)
- 278-282 5 +/-XXXX CSL 33 NM$
- 283-287 5 +/-XXXX CSL 33 Cheese merit dollars (CM$)
- 288-289 2 AA CH 39 Net merit percentile
- 290-292 3 +/-X.X CSL 95 PTA DPR
- 293 1 A CH 156 Interbull usability code for DPR
- 294-296 3 XXX CH 39 145 Average number of DIM for first-lactation daughters (MF)
- 297-299 3 XXX CH 39 145 Average number of DIM for first-lactation daughters (protein)
- 300-302 3 X.XX CH 39 145 Average age weight of daughters for PL evaluation
- 303-305 3 XXX CH 149 Pedigree completeness %
- 306-308 3 XXX CH 39 145 Percent of daughter first-lactation records that are in progress (MF)
- 309-311 3 XXX CH 39 145 Percent of daughter first-lactation records that are in progress (protein)
- 312-316 5 XXXXX CH 33 95 Number of herds (DPR)
- 317-321 5 XXXXX CH 33 Number of herds (MF)
- 322-326 5 XXXXX CH 33 Number of herds (protein)
- 327-331 5 XXXXX CH 33 95 145 Number of herds (PL)
- 332-336 5 XXXXX CH 33 95 Number of herds (SCS)
- 337-341 5 XXXXX CH 33 95 Number of daughters (DPR)
- 342-346 5 XXXXX CH 33 Number of daughters (MF)
- 347-351 5 XXXXX CH 33 41 Number of daughters (protein)
- 352-356 5 XXXXX CH 95 145 Number of daughters (PL)
- 357-361 5 XXXXX CH 95 Number of daughters (SCS)
- 362 1 X CH 136 Interbull usability code for SCS
- 363 1 X CH 150 Interbull preferred ID code/Clonal evaluation source code
- 364 1 X CH 152 Interbull usability code for PL
- 365-367 3 X.XX CH 33 145 Average number of lactations per daughter (MF)
- 368-370 3 X.XX CH 33 145 Average number of lactations per daughter (protein)
- 371-373 3 XXX CH Heterosis coefficient
- 374-376 3 XXX CH 33 145 Average number of lactations in daughter management group (MF)
- 377-379 3 XXX CH 33 145 Average number of lactations in daughter management group (protein)
- 380-381 2 AA CH 4 Predominate breed for crossbred animals
- 382-384 3 XX.X CH 33 95 145 Average standardized DPR
- 385-389 5 XXXXX CH 33 145 Average standardized milk
- 390-393 4 XXXX CH 33 145 Average standardized fat yield
- 394-395 2 X.X CH 33 145 Average standardized fat percent
- 396-400 5 XXXXX CH 33 145 Average standardized milk (protein)
- 401-404 4 XXXX CH 145 Average standardized protein yield
- 405-406 2 X.X CH 33 145 Average standardized protein percent
- 407-409 3 XX.X CH 33 95 145 Average PL of daughters
- 410-412 3 X.XX CH 95 145 Average standardized SCS
- 413-414 2 XX CH Number of countries in evaluation
- 415-417 3 AAA CH 119 Country with most daughters
- 418-422 5 +/-XXXX CSL 33 145 Daughter yield deviation (DYD) milk PTA milk change (interim summary)
- 423-426 4 +/-XXX CSL 33 95 145 DYD fat PTA fat change (interim summary)
- 427-429 3 +/-.XX CSL 33 95 145 DYD fat percent
- 430-434 5 +/-XXXX CSL 33 145 DYD milk (protein)
- 435-438 4 +/-XXX CSL 33 145 DYD protein PTA protein change (interim summary)
- 439-441 3 +/-.XX CSL 33 95 145 DYD protein percent
- 442-445 4 +/-XX.X CSL 33 95 145 Daughter deviation for PL
- 446-449 4 +/-X.XX CH 95 145 Daughter deviation for SCS
- 450-451 2 XX CH 154 Percentage of predominate breed for crossbred animals
- 452-456 5 +/-XXXX CSL 145 Parent average (PA) milk
- 457-458 2 XX CH 145 Reliability of PA (MF)
- 459-462 4 +/-XXX CSL 95 145 PA fat
- 463-464 2 XX CH 145 Reliability of PA (protein)
- 465-468 4 +/-XXX CSL 95 145 PA protein
- 469-470 2 XX CH 95 145 Reliability of PA (PL)
- 471-473 3 +/-X.X CSL 95 145 PA PL
- 474-475 2 XX CH 95 145 Reliability of PA (SCS)
- 476-478 3 X.XX CH 95 145 PA SCS
- 479-481 3 XXX CH Percent of daughters in the US - Genomic bulls with no daughters are reported as 100% US
- 482 1 X CH 125 Interbull usability code for yield
- 483-484 2 AA CH 145 Herdbook identifier [North American (NA) or international (I-blank)]
- 485 1 A CH Evaluation restriction code (for CDCB and NAAB use only)
- 486-500 15 00...00 CH Zeroes: Available for future use
- 501-504 4 +/-XX.X CSL 95 Daughter deviation for DPR
- 505-507 3 +/-X.X CSL 95 PA DPR
- 508-509 2 XX CH 95 Reliability of PA DPR
- 510-514 5 +/-XXXX CSL PA NM$
- 515-516 2 XX CH 95 Reliability of PA NM$
- 517-520 4 +/-XX.X CSL 95 Sire conception rate (SCR)
- 521-522 2 XX CH Reliability of SCR
- 523-529 7 XX...XX CH 95 Number of breedings for SCR
- Red and White or clonal evaluation source
- identification information
- 530-531 2 AA CH 4 Breed code (alpha code only, no zeros)
- 532-534 3 AAA CH 119 Country code of ID origin
- 535-546 12 XX...XX CH Identification number (registration or eartag)
- 547 1 X CH 160 Genomic indicator code
- 548-567 20 XXXAAXXXXX CH 95 145 Continuation of secondary NAAB codes (two codes)
- Heifer conception rate (HCR) information
- 568-571 4 +/-XX.X CSL PTA HCR
- 572-573 2 XX CH Reliability of PTA HCR
- 574-578 5 XXXXX CH Number of herds (HCR)
- 579-584 6 XX...XX CH Number of daughters (HCR)
- 585 1 A CH Interbull usability code for HCR (0 domestic and official, 2 Interbull and official)
- Cow conception rate (CCR) information
- 586-589 4 +/-XX.X CSL PTA CCR
- 590-591 2 XX CH Reliability of PTA CCR
- 592-596 5 XXXXX CH Number of herds (CCR)
- 597-602 6 XX...XX CH Number of daughters (CCR)
- 603 1 A CH Interbull usability code for CCR (0 domestic and official, 2 Interbull and official)
- 604-607 4 +/-XX.X CSL PA HCR
- 608-609 2 XX CH Reliability of PA HCR
- 610-613 4 +/-XX.X CSL PA CCR
- 614-615 2 XX CH Reliability of PA CCR
- 616-617 2 XX CH 162 Type of chip
- 618-621 4 +/-XX.X CSL Genomic inbreeding coefficient of this bull (%)
- 622-625 4 +/-XX.X CSL Genomic future inbreeding coefficient of this bull (%)
- 626-630 5 +/-XXXX CSL 33 Grazing Merit dollars (GM$)
- Livability information (introduced August 2016)
- 631-634 4 +/-XX.X CSL 33 PTA livability
- 635-636 2 XX CH 33 Reliability of PTA livability
- 637-641 5 XXXXX CH 33 Number of herds (livability)
- 642-647 6 XXXXXX CH 33 Number of daughters (livability)
- 648-651 4 +/-XX.X CSL 145 Parent average (livability)
- 652-653 2 XX CH 145 Reliability of PA (livability)
- Gestation Length information (introduced August 2017)
- 654-656 3 +/-X.X CSL 33 PTA Gestation Length
- 657-658 2 XX CH 33 Reliability of PTA Gestation Length
- 659-663 5 XXXXX CH 33 Number of herds (Gestation Length)
- 664-669 6 XXXXXX CH 33 Number of daughters (Gestation Length)
- 670-672 3 +/-X.X CSL 145 Parent average (Gestation Length)
- 673-674 2 XX CH 145 Reliability of PA (Gestation Length)
- Milk fever information (introduced April 2018)
- 675-678 4 +-XX.X CSL 33 PTA Milk Fever
- 679-680 2 XX CH 33 Reliability of PTA Milk Fever
- 681-685 5 XXXXX CH 33 Number of herds (Milk Fever)
- 686-691 6 XXXXXX CH 33 Number of daughters (Milk Fever)
- 692-695 4 +-XX.X CSL 145 Parent average (Milk Fever)
- 696-697 2 XX CH 145 Reliability of PA (Milk Fever)
- Displaced abomasum information (introduced April 2018)
- 698-701 4 +-XX.X CSL 33 PTA Displaced abomasum
- 702-703 2 XX CH 33 Reliability of PTA Displaced abomasum
- 704-708 5 XXXXX CH 33 Number of herds (Displaced abomasum)
- 709-714 6 XXXXXX CH 33 Number of daughters (Displaced abomasum)
- 715-718 4 +-XX.X CSL 145 Parent average (Displaced abomasum)
- 719-720 2 XX CH 145 Reliability of PA (Displaced abomasum)
- Ketosis information (introduced April 2018)
- 721-724 4 +-XX.X CSL 33 PTA Ketosis
- 725-726 2 XX CH 33 Reliability of PTA Ketosis
- 727-731 5 XXXXX CH 33 Number of herds (Ketosis)
- 732-737 6 XXXXXX CH 33 Number of daughters (Ketosis)
- 738-741 4 +-XX.X CSL 145 Parent average (Ketosis)
- 742-743 2 XX CH 145 Reliability of PA (Ketosis)
- Mastitis information (introduced April 2018)
- 744-747 4 +-XX.X CSL 33 PTA Mastitis
- 748-749 2 XX CH 33 Reliability of PTA Mastitis
- 750-754 5 XXXXX CH 33 Number of herds (Mastitis)
- 755-760 6 XXXXXX CH 33 Number of daughters (Mastitis)
- 761-764 4 +-XX.X CSL 145 Parent average (Mastitis)
- 765-766 2 XX CH 145 Reliability of PA (Mastitis)
- Metritis information (introduced April 2018)
- 767-770 4 +-XX.X CSL 33 PTA Metritis
- 771-772 2 XX CH 33 Reliability of PTA Metritis
- 773-777 5 XXXXX CH 33 Number of herds (Metritis)
- 778-783 6 XXXXXX CH 33 Number of daughters (Metritis)
- 784-787 4 +-XX.X CSL 145 Parent average (Metritis)
- 788-789 2 XX CH 145 Reliability of PA (Metritis)
- Retained placenta information (introduced April 2018)
- 790-793 4 +-XX.X CSL 33 PTA Retained placenta
- 794-795 2 XX CH 33 Reliability of PTA Retained placenta
- 796-800 5 XXXXX CH 33 Number of herds (Retained placenta)
- 801-806 6 XXXXXX CH 33 Number of daughters (Retained placenta)
- 807-810 4 +-XX.X CSL 145 Parent average (Retained placenta)
- 811-812 2 XX CH 145 Reliability of PA (Retained placenta)
- Early First Calving (introduced April 2019)
- 813-816 4 +-XX.X CSL 33 PTA Early First Calving (introduced April 2019)
- 817-818 2 XX CH 33 Reliability of PTA Early First Calving (introduced April 2019)
- 819-823 5 XXXXX CH 33 Number of herds (Early First Calving) (introduced April 2019)
- 824-829 6 XXXXXX CH 33 Number of daughters (Early First Calving) (introduced April 2019)
- 830-833 4 +-XX.X CSL 145 Parent average (Early First Calving) (introduced April 2019)
- 834-835 2 XX CH 145 Reliability of PA (Early First Calving) (introduced April 2019)"""
- # Split the spec into one string per line
- meta_data = meta_data.split('\n')
- # Split each line into its columns, giving a list of lists:
- # [[obs11, obs12, ...], [obs21, obs22, ...], ...]
- # This assumes the columns in the spec are tab-separated.
- for i in range(len(meta_data)):
-     meta_data[i] = meta_data[i].split('\t')
- # Keep only the rows that describe a field (6 columns); drop the section labels.
- meta_data = [row for row in meta_data if len(row) == 6]
- # The field name is the last element of each row
- fields = [row[-1] for row in meta_data]
- # The first element of each row is the byte range,
- # e.g. "819-823".split('-') => ["819", "823"].
- pos = [row[0].split('-') for row in meta_data]
- for i in range(len(pos)):
-     # Single-byte fields list only one number
-     if len(pos[i]) == 1:
-         # Repeat it so that start == end, e.g. 2*["8"] => ["8", "8"]
-         pos[i] = 2*pos[i]
-     # Everything is still a string, so convert to ints
-     pos[i] = list(map(int, pos[i]))
-     # The spec is 1-indexed and inclusive; Python slices are 0-indexed and
-     # end-exclusive, so the slice becomes [start - 1, end).
-     pos[i][0] = pos[i][0] - 1
- def parse_line(line):
-     # Slice every field out of one fixed-width record and strip its padding
-     res = []
-     for start, end in pos:
-         res.append(line[start:end].strip())
-     return res
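The values returned by parse_line line up one-to-one with fields, so the two can be zipped into a dictionary keyed by field name. The following is a minimal, self-contained sketch of that idea, not part of the original paste: SAMPLE_SPEC, build_layout, parse_record and the sample record "0HO+1234" are all hypothetical names and data made up for illustration, and the spec rows are assumed to be tab-separated.

```python
# Illustrative spec rows (assumed tab-separated):
# position, length, format, type, notes, field name.
SAMPLE_SPEC = "\n".join([
    "1\t1\tX\tCH\t13\tSpecies code",
    "Bull's identification information",  # label row, dropped by the filter
    "2-3\t2\tAA\tCH\t4\tBreed of evaluation",
    "4-8\t5\t+/-XXXX\tCSL\t33\tPTA milk",
])


def build_layout(spec):
    """Return (fields, pos); pos holds 0-based, end-exclusive slice bounds."""
    rows = [line.split("\t") for line in spec.split("\n")]
    # Keep only rows with all six columns, i.e. real field definitions.
    rows = [row for row in rows if len(row) == 6]
    fields = [row[-1] for row in rows]
    pos = []
    for row in rows:
        bounds = [int(p) for p in row[0].split("-")]
        # A single-number position means start == end.
        start, end = bounds[0], bounds[-1]
        pos.append((start - 1, end))
    return fields, pos


def parse_record(line, fields, pos):
    """Map each field name to its stripped text value."""
    return {
        name: line[start:end].strip()
        for name, (start, end) in zip(fields, pos)
    }


fields, pos = build_layout(SAMPLE_SPEC)
record = parse_record("0HO+1234", fields, pos)
print(record)
```

All values stay as strings here. Converting numeric fields is left out because it depends on how each format code in the spec is meant to be read.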