Data Science COMP

This guide is meant for self-service. If at any time during the research process you feel the need for research support, email me. We will arrange a convenient time to meet on Zoom or another platform.

The research process

  1. Get background information from handbooks, encyclopedias, dictionaries
  2. Once your topic is narrowly defined, select databases to find specific articles that have been published in journals
  3. Look for films and images as non-literary forms of representation
  4. Find books on your topic to gain greater depth and understanding
  5. Write down or store all the references you have consulted to include them in the bibliography of your research paper

Search tips

  1. Begin by defining exactly what you are searching for
  2. Be specific when determining keywords: list synonyms, antonyms, and related terms to search
  3. Use the advanced interface of electronic databases and Internet search engines to help narrow your search
  • Limit results in electronic databases to full-text or peer-reviewed journals only
  4. Use Boolean operators (AND, OR, NOT) to connect search terms, and learn how each search engine interprets them
  5. Take notes during your research to keep track of where you have been, keywords searched, what worked and what didn't, etc.
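
As an illustration of the keyword and Boolean operator tips above, the following is a minimal Python sketch that builds a search string from groups of synonyms. The function name `build_boolean_query` and the sample keywords are hypothetical, chosen only for this example. It assumes a database that accepts AND, OR, NOT, parentheses, and quoted phrases; exact syntax varies between databases, so check the help pages of the one you are using.

```python
def build_boolean_query(concept_groups, exclude=()):
    """Join synonyms with OR inside each concept, then join concepts with AND.

    Hypothetical helper for illustration; not part of any library tool.
    """
    def quote(term):
        # Quote multi-word phrases so they are searched as a unit.
        return f'"{term}"' if " " in term else term

    groups = []
    for synonyms in concept_groups:
        joined = " OR ".join(quote(term) for term in synonyms)
        # Parentheses keep the OR terms together when combined with AND.
        groups.append(f"({joined})" if len(synonyms) > 1 else joined)

    query = " AND ".join(groups)
    for term in exclude:
        # NOT removes results that mention an unwanted term.
        query += f" NOT {quote(term)}"
    return query


query = build_boolean_query(
    [["data mining", "machine learning"], ["health", "medicine"]],
    exclude=["advertising"],
)
print(query)
# ("data mining" OR "machine learning") AND (health OR medicine) NOT advertising
```

Recording each generated search string in your research notes makes it easier to see which keyword combinations worked and which did not.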

Sources of information

Publication cycle: to find primary and secondary sources of information, start with tertiary sources such as dictionaries, encyclopedias, and handbooks. When researchers publish material, they follow the cycle clockwise. To find primary and secondary sources, follow the cycle anticlockwise.

Library search

Omni: the Carleton University Library search portal. For guidance, see Help With Using Omni.

Subject headings

Library of Congress Subject Headings (LCSH) use controlled vocabulary to access and express the subject content of documents. Data science has largely been divided into the following subjects or research areas. This is a searchable index. Click on a link below to discover the Library's holdings in this area:

  • Add keywords
  • Use the filters on the left of the resulting screen
  • Typical filters are Available Online and Peer Reviewed Journals
  1. Algorithms
  2. Analysis
  3. Artificial intelligence
  4. Big data
  5. Big data analytics
  6. Cloud computing
  7. Computer science
  8. Computer science -- Information systems
  9. Data analysis
  10. Data management
  11. Data mining
  12. Data processing
  13. Engineering -- Electrical and electronic
  14. Information technology
  15. Internet
  16. Internet of things
  17. Machine learning
  18. Management
  19. Research
  20. Software
  21. Statistics for Business, Management, Economics, Finance, Insurance
  22. Usage
  23. Video/film
  1. Business Source Complete: Business Source Complete is a scholarly business database.  It provides full-text access as well as indexing and abstracting for journals dating back as far as 1886. PEER REVIEWED
  2. CANSIM II: CANSIM II (Canadian Socio-Economic Information Management System) is Statistics Canada's database of time series covering a wide variety of social and economic aspects of Canadian life.
  3. Factiva: A news and business information source providing global coverage.
  4. Google dataset search:  a search engine from Google that helps researchers locate online data that is freely available for use
  5. Google public data explorer:  provides public data and forecasts from a range of international organizations and academic institutions including the World Bank, OECD, Eurostat and the University of Denver.
  6. LexisNexis Academic: LexisNexis Academic is now called Nexis Uni. Provides access to news, business and legal sources from LexisNexis.  
  7. OECD iLibrary: OECD iLibrary is OECD's online library for books, papers and statistics and the gateway to OECD's analysis and data.
  8. PubMed: Provides access to citations covering all areas of medicine and associated fields.
  9. SAGE Knowledge Encyclopedias: Carleton subscribes to the 2011 Encyclopedia Collection, which provides perpetual access to 27 encyclopedias in the social sciences published between 2005 and 2011, as well as some other encyclopedias that have been ordered individually. Browse by Content Type: "Encyclopedias" and/or "Handbooks".
  10. SAGE Research Methods: With information on the full range of qualitative, quantitative, and mixed methods for the social and behavioral sciences, as well as methods commonly used in the hard sciences, the book, reference, and journal content in SAGE Research Methods helps researchers of all levels conduct their research.
  • Cases: Cases are peer-reviewed and come with pedagogical tools including learning objectives and discussion questions
  • Datasets: Datasets is a collection of teaching datasets and instructional guides that give students a chance to learn data analysis by practicing themselves
  • Video: Video contains more than 125 hours of video, including tutorials, case study videos, expert interviews, and more, covering the entire research methods and statistics curriculum
  11. Scopus: A multidisciplinary abstract and citation database of research literature and web sources.
  12. World Bank e-Library: A fully cross-searchable portal of World Bank monograph and serial publications, including titles also issued in print.


  1. arXiv: arXiv is a free distribution service and an open-access archive for 1,769,134 scholarly articles in the fields of physics, mathematics, computer science, quantitative biology, quantitative finance, statistics, electrical engineering and systems science, and economics. Materials on this site are not peer-reviewed by arXiv.
  2. EarthArXiv: is moving from the OSF Preprints platform to the Janeway preprint platform at the California Digital Library (CDL). In preparation for this move, EarthArXiv will stop accepting submissions on Friday, August 21, 2020. The site will redirect to the new service when it becomes public on October 1, 2020.
  3. TechRxiv: TechRxiv (pronounced "tech archive") is an open, moderated preprint server for unpublished research in electrical engineering, computer science, and related technology. By using TechRxiv, authors can quickly disseminate their work to a wide audience and gain community feedback on a draft version of their research. A preprint is a draft version of an article; final versions of published articles should not be submitted to TechRxiv.


  1. The SAGE Handbook of Applied Social Research Methods: edited by Leonard Bickman and Debra J. Rog; Samuel J. Best, contributor. 2009 [electronic resource]
  2. How to Read a Book, v5.0: School of Information, University of Michigan
  3. How to Read a Paragraph: The Art of Close Reading
  4. How to Read (and Understand) a Social Science Journal Article: Tips and tricks to make reading and understanding social science journal articles easier
  5. How to Read for Grad School: Miriam E. Sweeney

Minecraft (Education Edition) is free for Carleton students

Just download the installer from the link below (for Windows 10):

After installing, launch the game and log in using your Carleton email address (cmail) and password.

Canada - Federal: 

  1. Baldwin-Green Study: Canada - U.S. Census of Industry 1867-1940: Canadian and US manufacturing industries at the 2-digit SIC code level for census years 1900 to 1940. The Canadian figures start at 1870. Only general figures were recorded, such as the number of employees, the number of establishments, and salaries and wages.
  2. Canada Year Book tables: selected statistics from 1907 to 1967 at ten year intervals.
  3. Canadian Alcohol and Drug Use Monitoring Survey (CADUMS): upon request
  4. Canadian Astronomy Data Centre
  5. Canadian Election Studies: why people vote the way they do ... what does and does not change during the campaign and from one election to another.
  6. Canadian Opinion Research Archive (CORA): to explore data holdings, click on tabs on left
  7. CISTI's Gateway to Research Data: scientific, technical and medical (STM) data sets from a broad range of scientific disciplines.
  8. Historical Canadian Macroeconomic Dataset 1871 - 1994: Includes GNP, implicit price deflator, population, real GNP, per capita GNP, government expenditures, exports, imports, money supply, bond yields, investment expenditures, current account balance.
  9. Historical Statistics of Canada, 2nd Ed.: contains about 1,088 statistical tables on the social, economic and institutional conditions of Canada from the start of Confederation in 1867 to the mid-1970s. Text is provided as HTML pages and all tables as individual spreadsheets in comma-separated value (CSV) files.
  10. Inflation Calculator from 1914 to the present: from the Bank of Canada; based on Stats Canada's CPI data.
  11. National Climate Data and Information Archive
  12. National Pollutant Release Inventory (NPRI): Find out about pollutant releases and transfers by postal code
  13. Open Data Portal: This pilot project provides a "single-window" to data already published by individual departments and agencies on their public Websites.


  1. Advanced Business Analytics, Data Mining, and Predictive Modeling: LinkedIn
  2. ASIS&T: Association for Information Science and Technology
  3. Association of Information Technology Professionals: AITP is the leading worldwide society of professionals in information technology
  4. Best Practices for Preparing Environmental Data Sets to Share and Archive: data management pages for data providers to the ORNL Distributed Active Archive Center (DAAC)
  5. Big Data Visualization: LinkedIn
  6. Business Intelligence Connections: LinkedIn
  7. CDL (California Digital Library): the CDL has continually broken new ground by developing systems linking our users to the vast print and online collections within UC and beyond
  8. CURVE: Carleton University Research Virtual Environment
  9. Data Scientists: LinkedIn
  10. Data Visualization: LinkedIn
  11. Databib: a tool for helping people identify and locate online repositories of research data.
  12. DataCite: A list of repositories for research data.
  13. DataFinder: from the Population Reference Bureau
  14. DataOne (Data Observation Network for Earth)
  15. Data-Planet Statistical Datasets
  16. Digital Curation Centre: (DCC) is a world-leading centre of expertise in digital information curation with a focus on building capacity, capability and skills for research data management across the UK's higher education research community
  17. FAA National Wildlife Aircraft Strike Database: contains records of reported wildlife strikes since 1990
  18. Gateway to Research Data: The NRC Gateway to Research Data provides central access to Canadian scientific, technical and medical (STM) data sets and other important data repositories, as well as links to selected policies and best practices guiding data management and curation activities in Canada
  19. Global Big Data and Analytics: LinkedIn
  20. Government Accountability Office (GAO)/US Government: advises Congress and the heads of executive agencies about ways to make government more efficient, effective, ethical, equitable and responsive
  21. Households and the Environment: Statistics Canada
  22. IASSIST: is an international organization of professionals working in and with information technology and data services to support research and teaching in the social sciences
  23. ICPSR Data Archive: provides leadership and training in data access, curation, and methods of analysis for a diverse and expanding social science research community
  24. Infochimps: delivers a cloud service solution for Big Data that eliminates the struggle to master all the new Big Data technologies
  25. Innovation Enterprise:  is an independent business-to-business multi-channel media brand focused on the information needs of Senior Big Data, Strategy, Advanced Analytics, Digital, Finance, Operations, Publishing & Decision Support executives
  26. Institute for Data Science at Carleton University
  27. IPUMS (Integrated Public Use Microdata Series): is one of the world's leading developers of demographic data resources
  28. IQSS Dataverse: The Harvard Dataverse Network is open to all scientific data from all disciplines worldwide; includes the world's largest collection of social science research data
  29. JISC: the UK's expert on digital technologies for education and research
  30. KDnuggets: News on Analytics, Big Data, Data Mining
  31. Lavastorm Analytics Community Group: LinkedIn
  32. Odum Institute Dataverse Network - data catalog: provides access to data collections curated by the Odum Institute as well as collections owned by other institutions and individual scholars
  33. OECD Environmental Data Compendium: revised regularly, presents data linking pollution and natural resources with activity in such economic sectors as energy, transport, industry and agriculture. It shows the state of air, inland waters, wildlife, etc., for OECD countries and describes selected responses by government and enterprises
  34. Open Data Portal (Canada): a key part of Canada’s Action Plan on Open Government to enhance transparency and accountability; provides one-stop access to Government of Canada data and information
  35. O'Reilly Strata: LinkedIn
  36. PMI Marketplace: PMI’s worldwide advocacy for project management is reinforced by our globally recognized standards and certification program, extensive academic and market research programs, chapters and communities of practice, and professional development opportunities
  37. Project Management Institute: We serve practitioners and organizations with standards that describe good practices, globally recognized credentials that certify project management expertise, and resources for professional development, networking and community
  38. Research Data Alliance: aims to accelerate and facilitate research data sharing and exchange
  39. Sociometrics: science-based products for researchers & practitioners
  40. Statista: aggregates statistical data on over 600 international industries from more than 18,000 sources, including market researchers, trade organizations, scientific journals, and government databases
  41. Strata Conference: the essential training and information source for data science and big data—with industry news, reports, in-person and online events, and much more
  42. Summits calendar: Innovation Enterprise is an independent B2B multi channel media brand, focused on the information needs of Senior Big Data, Finance, Operations, Planning, ...
  43. Top 30 LinkedIn Groups for Analytics, Big Data, Data Mining and Data Science 
  44. World Bank/Documents and Reports: To ensure that countries can access the best global expertise and help generate cutting-edge knowledge, the Bank is constantly seeking to improve the way it shares its knowledge and engages with clients and the public at large
  45. Zanran: gets you more meaningful numerical results than any other search engine
Content last reviewed: January 7, 2021