Thumbnail
DLS-2025-10-crashplan-navy.pdf

Avoid the Data Loss Nightmare with CrashPlan

Imagine losing years of research in an instant: the data you collected, analyzed, and relied on are gone. Accidental deletion, hardware failure, fire, or a lost device can strike without warning, and it happens more often than you think. Protect your work before it’s too late with CrashPlan, available for free to UCSB researchers, and don’t let data loss come back to haunt you.

Perma Link

PDF - ALT
TAGS: Storage, Data Backup, Loss Prevention, Data Recovery
DATE: 10-2025


Thumbnail
DLS-2025-09-github-zenodo.pdf

From Push to Publish: Preserving GitHub Projects with Zenodo

GitHub is a fantastic tool for version control and collaboration during the active phases of a project, but it is not designed for permanent archiving. For long-term preservation and accessibility, deposit your work in a repository like Zenodo, which assigns a digital object identifier (DOI), allowing your project to be reliably cited in the years to come. Thanks to GitHub’s integration with Zenodo, creating an archived snapshot is quick and easy.
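
The integration itself is point-and-click, but the resulting DOI is machine-readable. As a minimal sketch (the repository name is a hypothetical placeholder, and the response shape follows Zenodo's public records API), you can search Zenodo for archived snapshots from Python:

    import requests

    # Search Zenodo's public records API for archived snapshots of a
    # GitHub repository (hypothetical name, for illustration only).
    repo_name = "my-lab/analysis-pipeline"

    resp = requests.get(
        "https://zenodo.org/api/records",
        params={"q": repo_name, "size": 5},
        timeout=30,
    )
    resp.raise_for_status()

    for hit in resp.json().get("hits", {}).get("hits", []):
        # Each record carries a citable DOI alongside its metadata.
        print(hit.get("doi"), "-", hit["metadata"]["title"])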

Perma Link

PDF - ALT
TAGS: Citation, Code Documentation, Code Sharing, Data Preservation
DATE: 09-2025


Thumbnail
DLS-2025-08-datarescue_navy.pdf

Keeping Access to Public Datasets Afloat

Datasets and other digital resources are fragile, often lost due to removals, shifting priorities, lapses in maintenance, or changes in hosting. These losses disrupt access, hinder research and teaching, and undermine scientific reproducibility. Discover how the Data Rescue Project and the Research Data Services Department at the UCSB Library can help preserve public datasets and ensure their ongoing accessibility.

Perma Link

PDF - ALT
TAGS: Data Access, Data Archiving, Data Preservation
DATE: 08-2025


Thumbnail
DLS-202507-intercodereliability-navy.pdf

Intro to Intercoder Reliability

Intercoder or inter-rater reliability refers to the degree of agreement among independent coders in their categorization or interpretation of data. High reliability reflects not only the consistent application of coding criteria but also a meaningful level of consensus among coders. This suggests that the analysis is not merely subjective, but systematic and replicable. Such consistency and shared understanding are essential for establishing the trustworthiness, rigor, and credibility of research findings.
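
Cohen's kappa, one of the most common agreement statistics, corrects raw percent agreement for the agreement expected by chance. A minimal sketch with scikit-learn, using two hypothetical coders' labels:

    from sklearn.metrics import cohen_kappa_score

    # Category assignments from two independent coders for ten items
    # (hypothetical labels, for illustration only).
    coder_a = ["pos", "pos", "neg", "neu", "pos", "neg", "neg", "pos", "neu", "pos"]
    coder_b = ["pos", "neg", "neg", "neu", "pos", "neg", "pos", "pos", "neu", "pos"]

    raw_agreement = sum(a == b for a, b in zip(coder_a, coder_b)) / len(coder_a)
    kappa = cohen_kappa_score(coder_a, coder_b)

    print(f"Raw agreement: {raw_agreement:.2f}")  # 0.80
    print(f"Cohen's kappa: {kappa:.2f}")          # ~0.68 after chance correction

Kappa always sits at or below raw agreement, because some matches would occur even if both coders labeled at random.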

Perma Link

PDF - ALT
TAGS: Data Analysis, Statistics, Cohen's Kappa, Reliability, Rater Agreement
DATE: 07-2025


Thumbnail
DLS-2025-06-StatsPitfalls-navy.pdf

Common Stats Pitfalls

Understanding widespread misconceptions in statistics is essential for anyone working with quantitative data. By recognizing these pitfalls, researchers can more critically evaluate statistical claims, design more robust studies, analyze data more effectively, and report findings with greater accuracy and confidence.
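
One classic example is running many uncorrected significance tests: even on pure noise, the chance of at least one "significant" result grows quickly. A minimal simulation sketch (sample sizes and thresholds chosen for illustration):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(42)
    n_experiments, n_tests = 1000, 20

    flagged = 0
    for _ in range(n_experiments):
        # Twenty t-tests comparing two groups drawn from the SAME distribution,
        # so every "significant" result is a false positive.
        p_values = [
            stats.ttest_ind(rng.normal(size=30), rng.normal(size=30)).pvalue
            for _ in range(n_tests)
        ]
        if min(p_values) < 0.05:
            flagged += 1

    # Expect roughly 1 - 0.95**20, i.e. about 64% of experiments flag something.
    print(f"Experiments with at least one false positive: {flagged / n_experiments:.0%}")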

Perma Link

PDF - ALT
TAGS: Data Analysis, Quantitative Data, Reproducibility, Statistics
DATE: 06-2025


Thumbnail
DLS-2025-05-TextPreprocessing_navy.pdf

The Basics of Text Preprocessing

Text preprocessing is a crucial first step in transforming unstructured text into machine-readable data. It involves cleaning, organizing, and standardizing language to establish a reliable foundation for analysis and interpretation. By removing noise and inconsistencies, preprocessing enhances algorithm performance, leading to more accurate results in tasks such as sentiment analysis, classification, and information retrieval. While the specific workflow will depend on your research question and analytical goals, here is a breakdown of some common steps, along with an example.
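
As a concrete illustration, a minimal Python sketch of a few of those common steps: lowercasing, punctuation removal, whitespace tokenization, and filtering against a small hand-picked stopword list (a real workflow would use a fuller list):

    import re

    STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "for"}

    def preprocess(text: str) -> list[str]:
        """Lowercase, strip punctuation, tokenize, and drop common stopwords."""
        text = text.lower()                   # normalize case
        text = re.sub(r"[^\w\s]", " ", text)  # replace punctuation with spaces
        tokens = text.split()                 # simple whitespace tokenization
        return [t for t in tokens if t not in STOPWORDS]

    print(preprocess("The data, once cleaned, is ready for analysis!"))
    # ['data', 'once', 'cleaned', 'ready', 'analysis']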

Perma Link

PDF - ALT
TAGS: Text Analytics, Natural Language Processing, Data Cleaning, Data Preparation
DATE: 05-2025


Thumbnail
DLS-202504-dataquality-navy.pdf

Cultivating Quality in Tabular Data

Poor data quality can result in unreliable analysis, inaccurate conclusions, and wasted effort. Since 'quality' is broad and often subjective, we break it down into key dimensions, each with guiding questions to help evaluate critical attributes of tabular data.
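
Many of those guiding questions can be answered with quick programmatic checks. A minimal sketch with pandas (the file name and the "age" column are hypothetical), probing completeness, uniqueness, and validity:

    import pandas as pd

    df = pd.read_csv("survey_responses.csv")  # hypothetical dataset

    # Completeness: what fraction of each column is missing?
    print(df.isna().mean().sort_values(ascending=False))

    # Uniqueness: are any rows exact duplicates?
    print(f"Duplicate rows: {df.duplicated().sum()}")

    # Validity: do values fall within a plausible range? (column assumed)
    if "age" in df.columns:
        out_of_range = df[(df["age"] < 0) | (df["age"] > 120)]
        print(f"Out-of-range ages: {len(out_of_range)}")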

Perma Link

PDF - ALT
TAGS: Tabular Data, Quality Control
DATE: 04-2025


Thumbnail
png2pdf.pdf

Decoding Character Encoding Issues

Character encoding is the system that assigns numeric values to characters, such as letters and symbols, so computers can store, process, and share them. Encoding issues have become less common in data handling with the widespread adoption of the UTF-8 standard. Still, some researchers may experience problems when working with data from legacy systems or old databases. Here, we cover some of the basics of character encoding standards and tips for researchers to avoid potential problems.
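
The classic symptom is mojibake: bytes written in one encoding and read back in another. A minimal Python sketch reproducing the problem and the fix (the file name is a placeholder):

    # UTF-8 bytes for "café" misread as Latin-1 yield the classic mojibake.
    raw = "café".encode("utf-8")
    print(raw.decode("latin-1"))  # cafÃ©  (wrong decoding)
    print(raw.decode("utf-8"))    # café   (correct decoding)

    # When reading files, state the encoding explicitly rather than relying
    # on the platform default; errors="replace" surfaces bad bytes visibly.
    with open("legacy_export.txt", encoding="cp1252", errors="replace") as f:
        text = f.read()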

Perma Link

PDF - ALT
TAGS: Data Management, Data Cleaning, Data Organization
DATE: 03-2025


Thumbnail
DLS-202502-secondarysources-navy.pdf

Eight Tips for Handling Secondary Data

Have you identified any pre-existing data that could be relevant to your project? When reusing someone else's data, it's crucial to follow key steps to document its provenance properly. This includes detailing its origin, context, and lineage, which helps maintain transparency and traceability throughout your work.
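
One lightweight way to capture provenance is a small machine-readable record saved alongside the data. A minimal sketch (the fields and values are illustrative, not a formal metadata standard):

    import json
    from datetime import date

    # Illustrative provenance record for a reused dataset.
    provenance = {
        "source": "National Weather Archive (hypothetical)",
        "original_url": "https://example.org/dataset",  # placeholder URL
        "retrieved_on": date.today().isoformat(),
        "license": "CC-BY-4.0",
        "modifications": ["dropped rows with missing station IDs"],
    }

    with open("PROVENANCE.json", "w", encoding="utf-8") as f:
        json.dump(provenance, f, indent=2)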

Perma Link

PDF - ALT
TAGS: Data Analysis, Data Documentation, Data Reuse
DATE: 02-2025


Thumbnail
DLS012025-CHAFF-navy.pdf

Separating Research Files from the CHAFF

When moving a project between operating systems, you may come across irrelevant system files, often called Concealed, Hidden, And Forgotten Files (CHAFF). Want to learn how to identify and delete them before sharing your project? Keep reading!
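
As a minimal sketch of the idea, this Python snippet flags a few well-known offenders (.DS_Store from macOS, Thumbs.db and desktop.ini from Windows) anywhere under a project folder; review the list before deleting anything:

    from pathlib import Path

    # Common system-generated junk files; extend this set as needed.
    CHAFF_NAMES = {".DS_Store", "Thumbs.db", "desktop.ini"}

    def find_chaff(project_dir: str) -> list[Path]:
        """Return paths of known junk files anywhere under project_dir."""
        return [p for p in Path(project_dir).rglob("*") if p.name in CHAFF_NAMES]

    for path in find_chaff("."):
        print(path)
        # path.unlink()  # uncomment to actually remove the file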

Perma Link

PDF - ALT
TAGS: Data Sharing, File Management, Project Organization
DATE: 01-2025