1881 occupational/age data for parishes and sub-districts, from microdata.

Table ID:
OCC_1881_EW     (1251838)
Contents:
1881 occupational/age data for parishes and sub-districts, from microdata.
Approx. number of rows:
3,638,880
Table type:
Raw Data
Documentation Author:
Humphrey Southall
Geography:
Reporting units are identified by:
   Registration District Number
   Parish Identifier
Chronology:
The data are for the single year 1881.

Sources:

  1. These data were derived from the complete transcription of the census Enumerators' Books for 1881, created by the Genealogical Society of Utah in collaboration with family history societies throughout Britain.
  2. The underlying data used here were very extensively enhanced by a project based in the Department of History at the University of Essex, led by Kevin Schurer with Matthew Woollard as the senior researcher. This project was initially funded by the Leverhulme Trust, as 'The 19th Century Censuses Collection' (Humphrey Southall was a co-applicant on that project), and subsequently by the ESRC under their 'Future of Work' Programme. In particular, Matthew Woollard assigned each of the c. half million occupational titles that appeared in the original transcription to the 414 occupational categories that were used in the published reports of the 1881 census.
  3. The data set held here was derived from the enhanced version of the 1881 microdata by Hamish James of the History Data Service, also at Essex University, in September 2002 and supplied to the GBHGIS project at that time.


Notes:

  1. The data were created by the History Data Service via the following query applied to their database holding the microdata:

    select reg_cnty, subd_id, parid, occode, sex, hds_agegroup, count(whole1881.recid) as freq into hs1
    from whole1881
    group by reg_cnty, subd_id, parid, occode, sex, hds_agegroup
    order by reg_cnty, subd_id, parid, occode, sex, hds_agegroup
  2. This data set is designed to be usable by either Registration sub-Districts or parishes, so it contains separate rows for the different parts of parishes which are divided into more than one sub-district.
  3. The dataset uses various identifiers defined by the 1881 project, so must be used in conjunction with a series of lookup tables/codebooks. As received from the HDS, the dataset contained a small number of rows which did not match the codebooks, so the following changes were made as the data were loaded. NB the counts below are out of over 3 million rows:
    • 19,519 were excluded because their values for regc_code indicated that they were in the Channel Islands, the Isle of Man or an 'Unknown' area. NB these were the only rows that were deleted rather than modified.
    • 25 rows had values of sex that were not 'M', 'F' or 'U' (unknown), including 14 rows with the value 'Q', which may indicate a query. All these rows were reassigned to 'U'.
    • Similarly, six rows had values of age_group above 30 (other than '99', which indicates missing data). All these values were reassigned to age group 30, which indicates ages 145-149.
    • There were no similar problems with the three geographical identifiers, or the occupation codes.
    • After the above revisions, ten pairs of rows had matching values of all identifiers (geography, sex, age and occupation). This was dealt with by merging the rows in each pair.
    The final result is that all rows link to the codebooks, and the only frequency counts for England and Wales that have been excluded are those where the County is unknown.
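
The recoding and merging rules listed above can be sketched in Python as follows. This is an illustrative sketch only, not the loading code actually used: the function name clean_rows, the tuple layout and the sample values are hypothetical, and it assumes that "merging" duplicate rows means summing their freq values. The exclusion of Channel Islands, Isle of Man and 'Unknown' rows is not shown, because the relevant regc_code values are not listed here.

```python
from collections import Counter

VALID_SEX = {"M", "F", "U"}   # values allowed by the check constraint on sex
MAX_AGE_GROUP = 30            # highest valid band (ages 145-149)
MISSING_AGE_GROUP = 99        # code for missing age data


def clean_rows(rows):
    """Apply the recoding rules, then merge rows whose identifiers now match.

    rows: iterable of (subd_id, par_id, occ_code, sex, age_group, freq).
    Returns a dict mapping (subd_id, par_id, occ_code, sex, age_group) to freq.
    Assumption: duplicate rows are merged by summing freq.
    """
    merged = Counter()
    for subd_id, par_id, occ_code, sex, age_group, freq in rows:
        if sex not in VALID_SEX:
            sex = "U"  # e.g. 'Q' is reassigned to unknown
        if age_group > MAX_AGE_GROUP and age_group != MISSING_AGE_GROUP:
            age_group = MAX_AGE_GROUP  # impossibly high bands are capped
        merged[(subd_id, par_id, occ_code, sex, age_group)] += freq
    return dict(merged)


# Hypothetical sample rows, invented purely for illustration.
sample = [
    (10101, 1, 55, "Q", 12, 3),   # invalid sex, becomes 'U'
    (10101, 1, 55, "U", 12, 2),   # now duplicates the row above
    (10101, 1, 55, "M", 31, 1),   # age group above 30, becomes 30
    (10101, 1, 55, "M", 99, 4),   # 99 (missing) is left unchanged
]
cleaned = clean_rows(sample)
```

With these sample rows, the first two rows collapse into a single 'U' row with freq 5, and the row coded 31 is moved to age group 30.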


Checking:

  1. Other than the above changes to a small number of rows, the data are exactly as supplied by the History Data Service. A number of cross-checks against the published 1881 tables for age-groups and occupations, and for parish-level populations, are possible but have yet to be applied. Given the large problems in working with the micro-data, exact matches with the published data are unlikely.


Acknowledgments:


We are extremely grateful to the following:

  1. Hamish James: These data were derived from the 1881 micro-data by Hamish James, whose contribution should be acknowledged.


Indices:

Index                 Type          Column(s) indexed
occ_1881_ew_pk        Primary key   subd_id, par_id, occ_code, sex, age_group
occ_1881_ew_reg_idx                 reg_num, occ_code, sex, age_group
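
The primary key includes subd_id as well as par_id because, as described in the Notes, a parish divided between sub-districts has a separate row for each part. A parish-level count therefore needs the parts to be summed. The following Python sketch shows one way to do this; the function name parish_totals and the sample values are hypothetical, and it assumes that par_id on its own identifies a parish, which should be checked against occ_1881_places before use.

```python
from collections import defaultdict


def parish_totals(rows):
    """Sum freq across sub-districts to give one count per parish category.

    rows: iterable of (subd_id, par_id, occ_code, sex, age_group, freq).
    Assumption: par_id alone identifies a parish.
    """
    totals = defaultdict(int)
    for subd_id, par_id, occ_code, sex, age_group, freq in rows:
        # subd_id is deliberately dropped from the grouping key.
        totals[(par_id, occ_code, sex, age_group)] += freq
    return dict(totals)


# Hypothetical rows: parish 7 is split between two sub-districts.
parts = [
    (20301, 7, 120, "F", 5, 6),
    (20302, 7, 120, "F", 5, 4),
    (20302, 8, 120, "F", 5, 9),
]
totals = parish_totals(parts)
```

Grouping on subd_id instead of par_id, in the same way, would give sub-district totals.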


Constraints:

The table has the following associated constraints:

Constraint                 Type          Details
occ_1881_ew_pk             Primary Key   See details above for primary key index
occ_1881_ew_fk_age_group   Foreign Key   Column(s) {7} link to table occ_1881_age_groups, column(s) {1}.
occ_1881_ew_fk_occ_code    Foreign Key   Column(s) {5} link to table occ_1881_codebook, column(s) {1}.
occ_1881_ew_fk_places      Foreign Key   Column(s) {3,4} link to table occ_1881_places, column(s) {2,3}.
occ_1881_ew_fk_regc        Foreign Key   Column(s) {1} link to table occ_1881_regc_codes, column(s) {1}.
occ_1881_ew_ck_sex         Check         ((((sex)::text = 'M'::text) OR ((sex)::text = 'F'::text) OR ((sex)::text = 'U'::text)))



Columns within table:

Column      Type                        Contents
regc_code   Text string (max.len.=7).   A three-letter abbreviation for the name of the Registration County, as defined in 'occ_1881_ew_regc_codes'.
reg_num     Integer number.             Number identifying the Registration District, as extracted from subd_id and as used elsewhere in the 1881 census reports.
subd_id     Integer number.             Combined Registration District and sub-District ID, as defined in occ_1881_places.
par_id      Integer number.             Number identifying each parish, as defined in occ_1881_places.
occ_code    Integer number.             Code number identifying the occupational category, as defined in occ_1881_codebook. These run from 1 to 414, with 999 indicating missing values.
sex         Text string (max.len.=5).   Gender: 'M' = Male, 'F' = Female, 'U' = Unknown.
age_group   Integer number.             Age group, in five-year bands: 1 = 0-4, 2 = 5-9, etc., as defined in occ_1881_age_groups. This was generated using the function: ceiling((whole1881.age+1)/5). After recoding a very small number of impossibly high values, these codes run up to 30, for ages 145 to 149, with 99 indicating missing data.
freq        Integer number.             Number of persons in this combination of categories.
rec_num     Integer number.             Sequence number added to keep rows in order.
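
As a worked illustration of the age_group coding described above, the following Python sketch applies ceiling((age+1)/5) and converts a band code back to its age range. The function names age_group and age_band are hypothetical, and the sketch assumes the division is real rather than integer division; the authoritative list of bands is the occ_1881_age_groups table.

```python
import math

MISSING_AGE_GROUP = 99  # code used for missing age data


def age_group(age):
    """Five-year band code for an age in years: ceiling((age + 1) / 5)."""
    return math.ceil((age + 1) / 5)


def age_band(group):
    """Return (lowest age, highest age) for a band code, or None if missing."""
    if group == MISSING_AGE_GROUP:
        return None
    low = (group - 1) * 5
    return (low, low + 4)


print(age_group(0), age_group(4), age_group(5))  # prints: 1 1 2
print(age_band(30))                              # prints: (145, 149)
```

For whole-number ages the formula is equivalent to age // 5 + 1, so ages 0 to 4 fall in band 1 and ages 145 to 149 in band 30.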