New advancements in digital technologies – often referred to as the Fourth Industrial Revolution or the Second Machine Age – are poised to significantly disrupt Canada’s labour force. While recent research suggests that automation and technological change won’t necessarily lead to a considerable increase in unemployment, findings agree that it will significantly change the demand for skills.
In July 2019, MaRS launched planext, a career navigation tool that helps job seekers navigate the future of work. Since the launch, we have been engaging the ecosystem through Project Integrate, a project funded by the Future Skills Centre. This initiative is testing the potential impact and feasibility of a single technology-enabled employment and training pathway for youth. Working with employment service provider networks across Canada, the project is conducting systems research and field testing with a range of promising employment-related technologies in each of the three phases of the employment pathway: Engagement, Systems Navigation and Career Laddering. The project centres on ecosystem engagement, and this engagement has taught us a lot about how we can improve planext for job seekers.
With the global pandemic disrupting every facet of the workplace, we have come to realize that the pathways should not revolve solely around automation. Rather, employment pathways can be more useful if they consider all relevant disruptions to one’s work. Disruptions that have been highlighted throughout my internship are (1) automation, (2) large-scale unemployment resulting from the COVID-19 pandemic, (3) worker displacement due to workplace injuries and (4) worker migration from sectors expected to see decreased demand towards emerging sectors, such as from oil and gas to clean energy.
We at MaRS have thought a lot about how data and analytics can be used to equip workers to make more informed and timely decisions about their education and careers. Building employment pathways is challenging, and the quality of pathways depends heavily on the quality of the labour market information being used. We’ve explored ways to measure the demand for skills, the economic value of skills and the skill intensity or level required by particular jobs. We’ve also explored ways to map skills to training and education opportunities, and to translate research about the impacts of automation from an occupation level to a job level in order to make these insights more accessible to job seekers. We’ve become intimately familiar with the challenges inherent in these kinds of analyses, from using unstructured and messy job posting data to the nature of job descriptions, where certain skills are explicitly stated while others are implicit. We also recognize that job attributes are only one piece of the puzzle: the economic and regulatory environments are major factors when considering the sustainability of a job or career path. We have an appreciation for the complexity and multiplicity of the challenges faced by marginalized job seekers, as well as for our ability to understand these challenges using data.
Since the release of our minimum viable product, we have engaged in beta-testing to understand the experience of job seekers with planext, identify their pain points and assess how well planext meets their needs. The most common feedback we received on planext was that users could not find their jobs. Choosing a set of job titles to fully represent the workforce is not easy, and identifying how to move between them is even more complicated. The first iteration of planext relies on O*NET, a standard and informative dataset of labour information collected and curated by the US government. O*NET provides a clear set of nearly 1000 occupations, each defined with relevant traits. While this structure allows us to model adjacency between jobs, its rigidity hides the messy reality of the job market. Not only are 960 job titles too few for the range of jobs available today, but many people cannot succinctly represent their job using a common title. For these reasons, O*NET, on its own, imposes an inherent limitation on planext’s ability to capture the workforce.
Our primary objective this summer was to provide improved pathways between a larger set of job titles than is used in the current data model powering planext. As such, we focused on integrating job posting data into planext. As job-seekers regularly interact with job posting data, we believe this data will enable users to more easily locate their jobs in planext, as well as provide a more granular set of occupations upon which we can model job distances and build pathways. With this, we hope to solve the issue of users being unable to locate themselves in the current system, and provide improved pathways to job-seekers. These pathways can then be paired with workplace analyses to enable users to navigate disruptions to work.
In the context of Project Integrate, it is critical that we strive for the best possible career pathway results for the youth job seekers who are using planext every day. Our work focuses on exploring the use of digital tools, like planext, in each phase of the employment pathway: Engagement, Systems Navigation and Career Laddering. planext is a tool that focuses specifically on the third phase, career laddering. Thus, the improvement of the matching algorithm in the tool required the integration of job posting data and new modelling approaches. As we continue to put planext in front of job seekers, we are really excited to test the new pathways and learn more about how planext can better serve the job seekers of tomorrow.
Predicting automation is far from trivial. Faethm.ai, whose automation risk scores we used in planext, followed a method from an academic paper [1] published 7 years ago. This paper states multiple caveats, notably that it only considers technical feasibility and ignores political and economic factors, many of which can affect the adoption of automation as well as the direction automation takes. Further, some recent academic work proposes that automation is better viewed at the task level rather than the job level. In [2], after predicting task automation risks, the authors define jobs as an aggregation of tasks; this approach leads to significantly different results than [1], with a Spearman’s rank correlation of 0.04. This second paper also warns readers that its predictions will likely be wrong in a number of respects. Overall, providing automation risks responsibly will require strong technical expertise and significantly more validation than the current model has received.
Defining employment pathways is also a challenging task. Choosing a set of job titles to fully represent the workforce is non-trivial, and identifying how to move between them is even more complicated. The first iteration of planext relies exclusively on O*NET, a standard and informative dataset of labour information collected and curated by the US government. O*NET’s data is easy to use as it provides a clear set of 960 occupations, each defined with relevant traits. However, this rigid structure hides the messy reality of the job market. In fact, the most common feedback on planext was that users could not find their jobs. Not only are 960 job titles too few for the range of existing jobs, but many people have the wrong title for their job, and many more cannot succinctly represent their job using a common job title. Further, the skills defining O*NET occupations (including critical thinking, communication and problem solving) are too broad and general to properly capture one’s ability to change jobs. For these reasons, O*NET, on its own, imposes an inherent limitation on planext’s ability to capture the workforce.
Finally, planext’s website, while elegant and well designed, is tailored towards individuals who likely lack the context to properly grasp the information provided. As previously mentioned, providing career pathways is incredibly challenging. As such, it is almost certain that regardless of the quality of the model, in certain instances some of the information it provides will be erroneous, or imperfect. This does not remove the model’s potential as an explorative tool, but it does mean that it should not be used as a prescriptive tool. To ensure that it is used correctly, it is better to shift our primary users from individual users to employment service providers and/or career counselors who can use this tool to see possible options, but who can also use their own knowledge to fill the gaps in our model.
We hope to develop a tool that will guide and provide a foundation for the remainder of the project. This tool’s primary objective is to provide improved pathways between a larger set of job titles than the current model. With this, we hope to solve the issue of users being unable to locate themselves in the current system, and to provide more useful and realistic pathways to users. These pathways can then be paired with workplace analyses to help users navigate the disruptions that are affecting them or will affect them. The scale and potential of this project extend beyond the resourcing and time available for this work at MaRS.
To identify pathways between a large set of jobs, we need to select a set of job titles to use and a way to represent each job with numerical values such that the distance between two jobs represents the mobility between them. To produce these numerical values, called embeddings, we use a dataset of job postings in conjunction with O*NET. The job postings provide real-world data and fill the gaps between O*NET’s limited occupations, while O*NET’s curated structure provides a way for us to validate our models. From these job postings, we extract features as a starting point, then develop, test, and compare several methods to produce embeddings that map well to the known structure of O*NET. Finally, for each job title, we average the embeddings of the postings that match that job title, and create pathways based on the proximity of one job title to another.
As previously mentioned, O*NET’s structured data limits our ability to capture the intricacies required for this project. We therefore augment the data using job posting data from two sources: (1) Vicinity Jobs and (2) job postings scraped by Datastock from indeed.com. Vicinity contains millions of job posting records from 2016 to the present, with extracted fields for each posting, including job title, geographic information, NOC ID, skills and remuneration. The Indeed dataset contains a little over a million raw job postings with the full job description. The descriptions are scraped directly with no formatting or normalization. While no job can be fully captured by its job description, we believe the integration of these two datasets provides a more complete source of job posting information. In order to validate our methods and provide some structure to the data, we label each posting with one or more O*NET occupations by comparing the posting’s title to O*NET’s occupation titles and alternate job titles. Using this matching system, we were able to label over 95% of the job postings. These labels enable comparison between job postings at four increasing levels of precision based on the SOC groupings: major groups, minor groups, broad occupations, and detailed occupations.
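The labelling step can be illustrated with a minimal sketch. It assumes exact matching on normalized titles, which is only one possible matching rule (the actual matching system is not specified here), and the occupation names, alternate titles and helper names below are illustrative rather than taken from O*NET or the planext code.

```python
import re

def normalize(title):
    """Lowercase a title, replace punctuation with spaces and collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9]+", " ", title.lower()).split())

def build_title_index(occupations):
    """Map every normalized title (main or alternate) to the occupations that use it."""
    index = {}
    for occupation, alternates in occupations.items():
        for title in [occupation] + list(alternates):
            index.setdefault(normalize(title), set()).add(occupation)
    return index

def label_posting(posting_title, index):
    """Return the set of occupation labels for a posting title (possibly empty)."""
    return set(index.get(normalize(posting_title), set()))

# Illustrative occupations and alternate titles (assumed, not real O*NET records).
OCCUPATIONS = {
    "Web Developers": ["Web Developer", "Front End Developer", "Developer"],
    "Software Developers": ["Software Developer", "Developer"],
}
INDEX = build_title_index(OCCUPATIONS)

print(label_posting("Front-End Developer", INDEX))  # one label
print(label_posting("developer", INDEX))            # ambiguous title, two labels
print(label_posting("Plumber", INDEX))              # unmatched title, no label
```

An ambiguous title such as "Developer" receives more than one label, which is why the later evaluation and loss both have to cope with multi-label postings.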
The next step in embedding postings is to decide the features to use. With the job descriptions being raw, unnormalized text, and to provide a more flexible input for the downstream tasks, we avoid the more state-of-the-art language models which take full sentences as input, and instead choose a bag of words approach. Since many important skills and work related attributes are better expressed in multiple words — e.g. Microsoft Excel, Agile Software Development, Human Resource Coordinator — we choose n-grams up to 5 in length instead of single words as our vocabulary. To determine this n-gram dictionary, we aggregate all of the skill information from Vicinity Jobs across all job postings in addition to the following O*NET attributes: skills, knowledge, work activities, job titles, alternate job titles, and sample job titles. We then use this dictionary to represent each job posting as a set of relevant n-grams.
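As a rough illustration of this feature extraction, the sketch below scans a posting for every n-gram of up to five tokens that appears in a dictionary. The tokenizer, the tiny dictionary and the function names are assumptions made for the example; the real dictionary is built from the Vicinity Jobs skills and the O*NET attributes listed above.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9+#]+", text.lower())

def extract_ngrams(text, dictionary, max_n=5):
    """Return the set of dictionary n-grams (1 to max_n tokens) found in the text."""
    tokens = tokenize(text)
    found = set()
    for n in range(1, max_n + 1):
        for start in range(len(tokens) - n + 1):
            candidate = " ".join(tokens[start:start + n])
            if candidate in dictionary:
                found.add(candidate)
    return found

# Illustrative dictionary; entries are stored lowercased.
DICTIONARY = {
    "microsoft excel",
    "excel",
    "agile software development",
    "human resource coordinator",
}
POSTING = "Seeking a Human Resource Coordinator with Microsoft Excel skills."

print(sorted(extract_ngrams(POSTING, DICTIONARY)))
# ['excel', 'human resource coordinator', 'microsoft excel']
```

Because the output is a set, word order and repetition inside the posting are discarded, which is the trade-off of a bag-of-words representation.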
Education is one of the important, clearly outlined criteria in job postings. We therefore associate each job posting with zero or more of the following levels of education by searching for education keywords in the posting: Certificate, Diploma, Bachelor, Masters, Doctorate, Law, MD, or Other.
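A keyword search of this kind might look like the sketch below. The regular expressions are assumptions chosen for illustration, not the keyword lists actually used, and no pattern is given for "Other" because the text does not say how that level is detected.

```python
import re

EDUCATION_LEVELS = [
    "Certificate", "Diploma", "Bachelor", "Masters",
    "Doctorate", "Law", "MD", "Other",
]

# Hypothetical keyword patterns; "Other" is deliberately left without one.
EDUCATION_PATTERNS = {
    "Certificate": r"\bcertificate\b",
    "Diploma": r"\bdiploma\b",
    "Bachelor": r"\bbachelor",
    "Masters": r"\bmaster['’]?s\b|\bmba\b",
    "Doctorate": r"\bdoctorate\b|\bph\.?d\b",
    "Law": r"\blaw degree\b|\bjuris doctor\b",
    "MD": r"\bmedical degree\b|\bdoctor of medicine\b",
}

def education_levels(posting_text):
    """Return the set of education levels whose keywords appear in the posting."""
    text = posting_text.lower()
    return {
        level
        for level, pattern in EDUCATION_PATTERNS.items()
        if re.search(pattern, text)
    }

def education_multi_hot(posting_text):
    """Encode the detected levels as a 0/1 vector ordered like EDUCATION_LEVELS."""
    levels = education_levels(posting_text)
    return [1 if level in levels else 0 for level in EDUCATION_LEVELS]

EXAMPLE = "Bachelor's degree required; Master’s or PhD preferred."
print(sorted(education_levels(EXAMPLE)))
print(education_multi_hot(EXAMPLE))
```

A posting that mentions no education keywords simply maps to the all-zero vector, matching the "zero or more" wording above.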
The first and simplest embedding method is to make a multi-hot encoding using the set of n-grams. In this method, you have a vector with a length equal to the size of the dictionary, where every index represents a specific n-gram. If the n-gram is present in the posting, it has a value of 1, otherwise it has a value of 0. While simple, it creates a very large embedding and the distance between two n-grams is identical regardless of how similar those n-grams are. For example, a posting that only contains Java is the same distance away from a posting that only contains Python as a posting that only contains Plumbing.
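The Java, Python and Plumbing example can be checked with a few lines of code. This is a minimal sketch with a three-entry vocabulary and Euclidean distance, both chosen only for illustration.

```python
import math

def multi_hot(ngrams, vocabulary):
    """Return a 0/1 vector with one position per vocabulary entry."""
    return [1 if term in ngrams else 0 for term in vocabulary]

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

VOCABULARY = ["java", "python", "plumbing"]
java = multi_hot({"java"}, VOCABULARY)
python = multi_hot({"python"}, VOCABULARY)
plumbing = multi_hot({"plumbing"}, VOCABULARY)

# Both distances are sqrt(2): the encoding cannot tell that Java is
# closer to Python than it is to Plumbing.
print(euclidean(java, python), euclidean(java, plumbing))
```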
This is the method used in all cases to embed the education information. The multi-hot education embedding is concatenated with the n-gram embedding, regardless of how the n-grams are embedded.
The second method uses a word embedding model, specifically fastText. We first embed every n-gram using fastText into a vector of size 300. We then use one of three methods to combine the set of embeddings:
We find that the third method provides the best results. This method avoids the issue where all n-grams are equally distant from one another; however, we experience information loss when aggregating the n-grams.
This final method builds upon the previous methods and uses a trained machine learning model to predict the O*NET occupation labels. After testing the two previous methods (see the results sections), we found the multi-hot embedding to perform slightly better, and therefore use it as the input. We use a simple three-layer feed-forward neural network (with layer sizes 512, 256, and 128 respectively) to generate embeddings of size 128. We attach a classification layer on top of this embedding for each of the four O*NET group sizes (major, minor, broad, detailed). As many job postings have multiple O*NET labels, we cannot use regular cross entropy. Instead, the labels are represented so that a single label is equivalent to one-hot encoding, two labels give two entries with a value of 0.5 at the respective label indices, three labels give three entries with a value of 0.3333, and so on. Our loss is defined as the KL divergence between the output of the classification layer and the vector representation of the label. Due to label imbalances, we multiply the loss by weights so that infrequent labels count more and frequent labels count less. We additionally include a negative entropy term to help exploration, which we found beneficial.
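The label representation and loss can be sketched in plain Python for a single posting and a single classification head. Several details here are assumptions, since the text does not specify them: the KL divergence is taken from the soft target to the predicted distribution, the class weights are applied per target class, and the entropy of the prediction is subtracted with a small coefficient (`entropy_coef` is a hypothetical name and value). The network itself is omitted.

```python
import math

def soft_label(label_indices, num_classes):
    """Spread a total mass of 1 evenly over the posting's labels."""
    target = [0.0] * num_classes
    for index in label_indices:
        target[index] = 1.0 / len(label_indices)
    return target

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_kl_loss(logits, label_indices, class_weights, entropy_coef=0.01):
    """Class-weighted KL(target || prediction) minus an entropy bonus (assumed form)."""
    probs = softmax(logits)
    target = soft_label(label_indices, len(logits))
    kl = sum(
        w * t * math.log(t / p)
        for t, p, w in zip(target, probs, class_weights)
        if t > 0
    )
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    # Subtracting the entropy adds a negative-entropy term to the loss,
    # which rewards less peaked predictions.
    return kl - entropy_coef * entropy

uniform_weights = [1.0, 1.0, 1.0, 1.0]
print(soft_label([0, 2], 4))
print(weighted_kl_loss([5.0, -5.0, 5.0, -5.0], [0, 2], uniform_weights))
print(weighted_kl_loss([-5.0, 5.0, -5.0, 5.0], [0, 2], uniform_weights))
```

In this sketch a prediction that puts its mass on the two correct labels yields a loss near zero, while one that puts its mass on the wrong labels is penalized heavily; raising the weights of the correct classes increases that penalty further.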
Finally, to generate pathways, we aggregate postings by job title. If a posting has multiple titles, we associate it with each title with a reduced weight based on how many titles it has. We then use a weighted average to obtain a single embedding per job title. For each job title, we then find the nearest job titles as possible career changes. By stringing multiple of these choices in a row, we obtain a pathway. There is flexibility here in how many jobs we show as possibilities given any specific starting job, as well in filtering certain titles based on chosen criteria.
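The aggregation and pathway steps could be sketched as follows. The weight of 1/(number of titles) per posting, the greedy choice of the single nearest unvisited title at each step, and the toy two-dimensional embeddings are all assumptions made for the example.

```python
import math
from collections import defaultdict

def title_embeddings(postings):
    """Weighted average of posting embeddings per job title.

    `postings` is a list of (titles, embedding) pairs; a posting with k titles
    contributes to each of them with weight 1/k (an assumed weighting).
    """
    sums = {}
    weights = defaultdict(float)
    for titles, embedding in postings:
        weight = 1.0 / len(titles)
        for title in titles:
            current = sums.get(title, [0.0] * len(embedding))
            sums[title] = [s + weight * x for s, x in zip(current, embedding)]
            weights[title] += weight
    return {t: [s / weights[t] for s in vec] for t, vec in sums.items()}

def nearest_titles(title, embeddings, k=3, exclude=()):
    """Return the k titles closest to `title`, skipping any in `exclude`."""
    origin = embeddings[title]
    candidates = [
        (math.dist(origin, vec), other)
        for other, vec in embeddings.items()
        if other != title and other not in exclude
    ]
    return [other for _, other in sorted(candidates)[:k]]

def build_pathway(start, embeddings, steps=3):
    """Greedily chain the nearest unvisited title at each step."""
    path = [start]
    for _ in range(steps):
        following = nearest_titles(path[-1], embeddings, k=1, exclude=path)
        if not following:
            break
        path.append(following[0])
    return path

POSTINGS = [
    (["Web Developer"], [0.0, 0.0]),
    (["Web Developer", "Software Developer"], [1.0, 0.0]),
    (["Software Developer"], [2.0, 0.0]),
    (["Java Developer"], [4.0, 0.0]),
    (["Hair Stylist"], [10.0, 10.0]),
]
EMBEDDINGS = title_embeddings(POSTINGS)
print(build_pathway("Web Developer", EMBEDDINGS, steps=2))
# ['Web Developer', 'Software Developer', 'Java Developer']
```

Raising `k` in `nearest_titles` shows several options per step instead of one, and the `exclude` argument is one place where title filters could be applied.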
With multiple labels per job posting and no “ground truth” job posting to compare against, there is no simple metric to use. First attempts at using cluster membership ran into difficulties as most clusters, especially using the simpler embedding methods, did not clearly correlate with any single label. Instead, we focus on our primary objective: that postings with similar labels have similar embeddings. With this in mind, we develop a metric, which we name neighbour overlap, where for each posting we find the 10 closest job postings and count the number of them that share a label with the original posting. We also compute the percentage of postings that have no label-sharing postings among their 10 nearest neighbours. We run this metric for each of the four O*NET group sizes. Because building the pairwise distance matrix has O(n^2) complexity, we run this metric on a subsample of 10,000 postings. While imperfect, as some labels have fewer than 10 instances in the 10,000-posting subsample, we believe this metric closely approximates the essence of what we are trying to achieve.
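A brute-force version of the neighbour overlap metric is sketched below, assuming Euclidean distance and label sets per posting; the function name and the toy data are illustrative. The nested loop makes the quadratic cost explicit, which is the reason for evaluating on a subsample.

```python
import math

def neighbour_overlap(embeddings, labels, k=10):
    """Return (mean overlap, percent of postings with zero overlap).

    For each posting, find its k nearest other postings and count how many
    share at least one label with it. `labels` is a list of sets.
    """
    overlaps = []
    for i, origin in enumerate(embeddings):
        distances = sorted(
            (math.dist(origin, other), j)
            for j, other in enumerate(embeddings)
            if j != i
        )
        neighbours = [j for _, j in distances[:k]]
        overlaps.append(sum(1 for j in neighbours if labels[i] & labels[j]))
    mean_overlap = sum(overlaps) / len(overlaps)
    percent_zero = 100.0 * sum(1 for o in overlaps if o == 0) / len(overlaps)
    return mean_overlap, percent_zero

# Toy data: two postings that share label "a", and two unrelated postings.
TOY_EMBEDDINGS = [[0.0], [1.0], [10.0], [11.0]]
TOY_LABELS = [{"a"}, {"a", "b"}, {"c"}, {"d"}]
print(neighbour_overlap(TOY_EMBEDDINGS, TOY_LABELS, k=1))
# (0.5, 50.0)
```

To evaluate the four O*NET group sizes, the same function would be called once per level with the label sets for that level.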
To validate the final model, we also use human validation. We ask humans to compare the existing planext career paths to career paths provided by our new model from 10 different and varied starting occupations. Specifically, we ask them to choose the career path they deem more feasible both for the first step (from the first occupation to the second occupation) and a full career path consisting of four occupations.
Embedding method comparison
For each O*NET grouping, we provide the mean neighbour overlap and the percent of postings that have zero overlap (%0 OL). For mean neighbour overlap, the higher the number the better the model. For %0 OL, the lower the percent, the better the model. From both metrics, and from the visualization below, it is clear that the machine learning model outperforms both other models.
|Major Group||Minor Group||Broad Occupation||Detailed Occupation|
|Mean||%0 OL||Mean||%0 OL||Mean||%0 OL||Mean||%0 OL|
Below we also show a visualization of each method, where each colour represents a different major group: (1) multi-hot embedding, (2) word embedding, and (3) machine learning embedding.
Final model validation
We choose 10 starting occupations, including the four most searched occupations in the original planext, and create a pathway for each occupation using both the original model and the new model. For both, we select the most similar occupations at each step. In every single case, humans found that the new model produced more feasible first step career changes. In all but one case, humans found the new model produced better full career paths.
Summary of preference for pathways produced by the new model compared to the old model
Example pathways from the old model (in yellow) and the new model (in green) below.
|Starting occupation||2nd occupation||3rd occupation||4th occupation||Note|
|Web Developers||Web Administrators||Methane/Landfill||Abstractors and Searchers||Most improved full career path: 94.1% voted for the new model|
|Web Developers||Software Developer||Application Developer||Java Developer|
|Hairdressers||Skincare Specialists||Opticians, Dispensing||Gaming Manager||Most improved single career step: 94.1% voted for the new model|
|Hairdressers||Hair Stylist||Hair Colorist||Stylist|
|Marketing Managers||PR and Fundraisers||Fundraisers||Agents and Business Managers of Artists, Performers and Athletes||Only full career path where people preferred the old model: 58.8% voted for the old model|
|Marketing Managers||Marketing Specialist||Marketing Associate||Digital Marketing Manager|
Looking at the results, there are a few clear takeaways. First and foremost, we are successful in producing improved pathways using the new approach. Secondly, we notice that overall the old method produced pathways where the change in occupation is significantly larger than in the new model’s pathways. This is because we have significantly increased the number of possible jobs, which means that smaller steps exist.
We succeeded in incorporating a larger selection of occupations from which a user can begin their career path, and the human validation process indicated that the new model produced an improved set of pathways. However, the model still requires testing and improvement. We first outline two small fixes that exist in the current planext model that can quickly improve the results of the current model. We then outline larger long-term improvements that could be made.
While the limited occupation set of the old model created career paths that often resulted in huge career leaps, the expanded set of occupations now allows, and often provides, career changes where the barrier to transition is very small. Applying a simple filter that forces each career change to be at least a minimum distance away would likely remedy this. Further, the old model used education, experience, training and salary information to suppress recommended occupations that involved significant downward movement. This is not currently implemented in the new model, which therefore sometimes recommends inappropriate downward movement. Although this is more difficult to implement because the simple one-to-one mapping provided by O*NET no longer exists, it is possible to first map each job title to a set of O*NET occupations and then either average or take the maximum of the values across this set to produce the information needed for this kind of guidance. The usefulness of this approach depends on the quality of the mapping from job postings to O*NET occupations and on the information loss that could result from aggregation.
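The minimum-distance filter proposed above has not been implemented; one possible shape for it is sketched here. The threshold value, the function name and the job titles and coordinates are hypothetical.

```python
import math

def filter_by_min_distance(origin, candidates, min_distance, k=3):
    """Return up to k nearest candidate titles that are at least min_distance away.

    `candidates` maps each job title to its embedding.
    """
    far_enough = []
    for title, vector in candidates.items():
        distance = math.dist(origin, vector)
        if distance >= min_distance:
            far_enough.append((distance, title))
    return [title for _, title in sorted(far_enough)[:k]]

# Hypothetical embeddings around a starting "Hair Stylist" posting at the origin.
ORIGIN = [0.0, 0.0]
CANDIDATES = {
    "Stylist": [0.1, 0.0],        # near-duplicate title, barely a career change
    "Hair Colorist": [1.0, 0.0],
    "Salon Manager": [2.0, 0.0],
}
print(filter_by_min_distance(ORIGIN, CANDIDATES, min_distance=0.5, k=2))
# ['Hair Colorist', 'Salon Manager']
```

Choosing the threshold would need testing with users, since a value that removes near-duplicate titles in one field may hide realistic small steps in another.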
While we currently include education level as part of the input to the machine learning model that creates the embeddings, we do so in a very simple manner. This input could be improved in a number of ways:
In the context of Project Integrate, it is critical that we strive for the best possible career pathway results for the youth job seekers who are using planext every day. Our work focuses on exploring the use of digital tools, like planext, in each phase of the employment pathway: Engagement, Systems Navigation and Career Laddering. planext is a tool that focuses specifically on the third phase, career laddering. Thus, the improvement of the matching algorithm in the tool required the sophisticated approach described here. As we continue to put planext in front of job seekers, we are really excited to test the new pathways and learn more about how planext can better serve the job seekers of tomorrow.
[1] Frey, Carl Benedikt, and Michael A. Osborne. “The future of employment: How susceptible are jobs to computerisation?” Technological Forecasting and Social Change 114 (2017): 254-280.
[2] Brynjolfsson, Erik, Tom Mitchell, and Daniel Rock. “What can machines learn, and what does it mean for occupations and the economy?” AEA Papers and Proceedings. Vol. 108. 2018.