| Title: | Processing and Analysis of 'O-GlcNAcAtlas' Data |
|---|---|
| Description: | Provides tools for processing and analyzing data from the 'O-GlcNAcAtlas' database <https://oglcnac.org/>, as described in Ma (2021) <doi:10.1093/glycob/cwab003>. It integrates 'UniProt' <https://www.uniprot.org/> API calls to retrieve additional information. It is specifically designed for research workflows involving 'O-GlcNAcAtlas' data, providing a flexible and user-friendly interface for customizing and downloading processed results. Interactive elements allow users to easily adjust parameters and handle various biological datasets. |
| Authors: | Yaoxiang Li [aut, cre] |
| Maintainer: | Yaoxiang Li <[email protected]> |
| License: | GPL-3 |
| Version: | 0.1.6 |
| Built: | 2026-05-27 07:05:04 UTC |
| Source: | https://github.com/yaoxiangli/oglcnac |
Summarizes added, removed, and shared Atlas row IDs.

    compare_atlas_tables(old_data, new_data, id_col = "id")

| Argument | Description |
|---|---|
| old_data | Previous Atlas data frame. |
| new_data | Updated Atlas data frame. |
| id_col | Row identifier column. Defaults to "id". |

Returns a one-row data frame with row and ID counts.
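The kind of summary described above can be sketched with base R set operations. This is a minimal illustration and not the package's implementation: `sketch_compare_ids` is a hypothetical helper, and the names of the output columns are assumptions.

```r
# Hypothetical helper: summarise added, removed, and shared row IDs
# between two data frames. Result column names are illustrative only.
sketch_compare_ids <- function(old_data, new_data, id_col = "id") {
  old_ids <- unique(old_data[[id_col]])
  new_ids <- unique(new_data[[id_col]])
  data.frame(
    old_rows = nrow(old_data),
    new_rows = nrow(new_data),
    added    = length(setdiff(new_ids, old_ids)),   # IDs only in new_data
    removed  = length(setdiff(old_ids, new_ids)),   # IDs only in old_data
    shared   = length(intersect(old_ids, new_ids))  # IDs in both
  )
}

# Made-up example tables
old_atlas <- data.frame(id = c(1, 2, 3), species = c("mouse", "rat", "human"))
new_atlas <- data.frame(id = c(2, 3, 4, 5), species = c("rat", "human", "mouse", "rat"))

summary_row <- sketch_compare_ids(old_atlas, new_atlas)
print(summary_row)
```

Comparing on unique IDs keeps duplicated rows from inflating the added and removed counts, while the row counts still reflect the raw table sizes.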
This function compares the original input tibble and the updated tibble, identifying and reporting any changes in the specified columns ('entry_name', 'protein_name', 'gene_name').

    compare_tibbles_uniprot(
      original_tibble,
      updated_tibble,
      entry_name_col = "entry_name",
      protein_name_col = "protein_name",
      gene_name_col = "gene_name"
    )

| Argument | Description |
|---|---|
| original_tibble | The original tibble before processing. |
| updated_tibble | The tibble returned after processing. |
| entry_name_col | The column name for entry names (default: "entry_name"). |
| protein_name_col | The column name for protein names (default: "protein_name"). |
| gene_name_col | The column name for gene names (default: "gene_name"). |

Returns nothing; prints the differences between the tibbles.

    # Example usage:
    # Original input tibble
    input_data <- tibble::tibble(
      id = c(1, 2),
      species = c("mouse", "rat"),
      sample_type = c("brain", "liver"),
      accession = c("O88737", "Q9R064"),
      accession_source = c("UniProt", "UniProt")
    )

    # Process the tibble (this will add the entry_name, protein_name, and gene_name)
    processed_data <- process_tibble_uniprot(input_data)

    # Compare the original and processed tibbles
    compare_tibbles_uniprot(input_data, processed_data)

Writes a validated Atlas table to CSV with blank cells for missing values.

    export_atlas_csv(data, file, dataset = NULL)

| Argument | Description |
|---|---|
| data | A data frame containing Atlas records. |
| file | Output CSV path. |
| dataset | Optional dataset label: |

Returns the output file path, invisibly.
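Writing missing values as blank cells can be illustrated with base R alone. The snippet below is a sketch under assumptions, not the body of `export_atlas_csv()`: the data frame is made up, and it relies only on the `na` argument of `write.csv()`.

```r
# Illustrative only: write a data frame to CSV so that NA becomes an empty cell.
atlas_like <- data.frame(
  id = c(1, 2),
  accession = c("O88737", NA),
  gene_name = c("Bsn", NA),
  stringsAsFactors = FALSE
)

out_file <- tempfile(fileext = ".csv")

# na = "" replaces the default "NA" marker with nothing at all
write.csv(atlas_like, out_file, na = "", row.names = FALSE)

csv_lines <- readLines(out_file)
print(csv_lines)
```

Blank cells are usually easier for spreadsheet tools and static sites to handle than the literal string `NA`, which can be mistaken for real data.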
This function launches a Shiny App for uploading, processing, and downloading UniProt data in CSV, TSV, or Excel format. Users can upload data, preview it, and select specific columns for processing. The processed data can be viewed and downloaded.

    launch_app()

Returns nothing.

    if (interactive()) {
      oglcnac::launch_app()
    }

This function parses the data retrieved from the UniProt API to extract the entry name, protein name, and gene name.

    parse_uniprot_data(uniprot_data)

| Argument | Description |
|---|---|
| uniprot_data | A list returned by the UniProt API query. |

Returns a list containing 'entry_name', 'protein_name', and 'gene_name'.

    # Example usage:
    # Retrieve UniProt data
    test_result <- retrieve_uniprot_data("O88737")

    # Parse the UniProt data
    parsed_result <- parse_uniprot_data(test_result)

    # Print the parsed result
    print(parsed_result)

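To show what extracting these three fields might involve without a network call, here is a sketch that works on a hand-built list. It is not the package's parser: `sketch_parse_entry` and `pluck_or_na` are hypothetical helpers, the nesting (`uniProtkbId`, `proteinDescription`, `genes`) is assumed to mirror the UniProtKB JSON layout, and the mock entry reuses the values listed for accession O88737 elsewhere in this manual.

```r
# Hypothetical helper: walk a nested list along a path of names/indices and
# return NA instead of failing when any step is missing.
pluck_or_na <- function(x, ...) {
  for (key in list(...)) {
    if (!is.list(x)) return(NA_character_)
    if (is.numeric(key) && length(x) < key) return(NA_character_)
    x <- x[[key]]
    if (is.null(x)) return(NA_character_)
  }
  as.character(x)
}

# Hypothetical parser; field paths are an assumption about the JSON layout.
sketch_parse_entry <- function(entry) {
  list(
    entry_name   = pluck_or_na(entry, "uniProtkbId"),
    protein_name = pluck_or_na(entry, "proteinDescription", "recommendedName",
                               "fullName", "value"),
    gene_name    = pluck_or_na(entry, "genes", 1, "geneName", "value")
  )
}

# Mock entry built by hand (no API call is made here)
mock_entry <- list(
  primaryAccession = "O88737",
  uniProtkbId = "BSN_MOUSE",
  proteinDescription = list(
    recommendedName = list(fullName = list(value = "Protein bassoon"))
  ),
  genes = list(list(geneName = list(value = "Bsn")))
)

parsed <- sketch_parse_entry(mock_entry)
str(parsed)
```

Returning `NA` for absent fields lets a caller decide whether to overwrite existing values, which matters for entries that lack a recommended name or gene annotation.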
Adds or normalizes the ambiguous field used by the public static website.

    prepare_atlas_data(data, dataset = NULL)

| Argument | Description |
|---|---|
| data | A data frame containing Atlas records. |
| dataset | Optional dataset label: |

Returns a data frame ready to export.
This function processes a tibble containing accession and accession_source columns. It retrieves data from the UniProt API for rows with accession_source == "UniProt" and overwrites or creates the entry_name, protein_name, and gene_name columns only if the parsed values are not NULL or NA.

    process_tibble_uniprot(
      data,
      accession_col = "accession",
      accession_source_col = "accession_source",
      entry_name_col = "entry_name",
      protein_name_col = "protein_name",
      gene_name_col = "gene_name"
    )

| Argument | Description |
|---|---|
| data | A tibble containing at least accession and accession_source columns. |
| accession_col | The column name for accession numbers (default: "accession"). |
| accession_source_col | The column name for accession sources (default: "accession_source"). |
| entry_name_col | The column name for entry names (default: "entry_name"). |
| protein_name_col | The column name for protein names (default: "protein_name"). |
| gene_name_col | The column name for gene names (default: "gene_name"). |

Returns a tibble with UniProt data processed.

    # Example usage:
    # Load necessary library
    library(tibble)

    # Reduced example data as an R tibble
    test_data <- tibble::tibble(
      id = c(1, 78, 83, 87),
      species = c("mouse", "mouse", "rat", "mouse"),
      sample_type = c("brain", "brain", "brain", "brain"),
      accession = c("O88737", "O35927", "Q9R064", "P51611"),
      accession_source = c("OtherDB", "UniProt", "UniProt", "UniProt"),
      entry_name = c("BSN_MOUSE", NA, "GORS2_RAT", NA),
      protein_name = c("Protein bassoon", NA, "Golgi reassembly-stacking protein2", NA),
      gene_name = c("Bsn", NA, "Gorasp2", NA)
    )

    # Process the tibble
    result_data <- process_tibble_uniprot(test_data)

    # Compare the original and processed tibbles
    compare_tibbles_uniprot(test_data, result_data)

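The overwrite rule described for this function (only rows whose source is "UniProt", and only when the parsed value is neither NULL nor NA) can be sketched offline. Everything below is illustrative: `sketch_enrich` and `stub_lookup` are hypothetical, the accessions and names are placeholders, and the real function queries the UniProt API rather than a stub.

```r
# Stub standing in for a UniProt lookup; returns NULL for unknown accessions.
# All values are placeholders, not real UniProt records.
stub_lookup <- function(accession) {
  known <- list(
    ACC1 = list(entry_name = "ENTRY1", protein_name = "Protein one", gene_name = NA),
    ACC2 = list(entry_name = "ENTRY2", protein_name = "Protein two", gene_name = "Gene2")
  )
  known[[accession]]
}

# Hypothetical enrichment: create the three columns if absent, then overwrite
# a cell only when the lookup produced a usable (non-NULL, non-NA) value.
sketch_enrich <- function(data, lookup) {
  fields <- c("entry_name", "protein_name", "gene_name")
  for (field in fields) {
    if (is.null(data[[field]])) data[[field]] <- NA_character_
  }
  for (i in seq_len(nrow(data))) {
    if (!identical(data$accession_source[i], "UniProt")) next
    parsed <- lookup(data$accession[i])
    for (field in fields) {
      value <- parsed[[field]]
      if (!is.null(value) && !is.na(value)) data[[field]][i] <- value
    }
  }
  data
}

example_rows <- data.frame(
  accession = c("ACC1", "ACC2", "ACC3"),
  accession_source = c("UniProt", "OtherDB", "UniProt"),
  gene_name = c("OldGene", "Kept", NA),
  stringsAsFactors = FALSE
)

enriched <- sketch_enrich(example_rows, stub_lookup)
print(enriched)
```

In this sketch the first row keeps its existing gene name because the stub returns NA for it, the second row is skipped because its source is not "UniProt", and the third row is left untouched because the lookup finds nothing.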
Enriches a data frame like process_tibble_uniprot() while avoiding repeated UniProt requests for accessions already present in a local RDS cache.

    process_tibble_uniprot_cached(
      data,
      cache_path = NULL,
      accession_col = "accession",
      accession_source_col = "accession_source",
      entry_name_col = "entry_name",
      protein_name_col = "protein_name",
      gene_name_col = "gene_name"
    )

| Argument | Description |
|---|---|
| data | A data frame containing accession and accession source columns. |
| cache_path | Optional RDS file path for cached parsed UniProt records. |
| accession_col | The accession column name. |
| accession_source_col | The accession source column name. |
| entry_name_col | The entry name column name. |
| protein_name_col | The protein name column name. |
| gene_name_col | The gene name column name. |

Returns a data frame with UniProt fields filled where available.
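The idea of skipping requests for accessions already held in a local RDS cache can be sketched as follows. This is an assumption-laden illustration rather than the package's code: `sketch_cached_lookup` and `counting_fetch` are hypothetical, the cache is assumed to be a named list keyed by accession, and the fetcher is a stub that makes no network request.

```r
# Hypothetical cached lookup: only accessions absent from the RDS cache are
# passed to `fetch`; the updated cache is written back to disk afterwards.
sketch_cached_lookup <- function(accessions, cache_path, fetch) {
  cache <- if (file.exists(cache_path)) readRDS(cache_path) else list()
  wanted <- unique(accessions)
  for (acc in setdiff(wanted, names(cache))) {
    record <- fetch(acc)
    if (!is.null(record)) cache[[acc]] <- record
  }
  saveRDS(cache, cache_path)
  cache[intersect(wanted, names(cache))]
}

# Stub fetcher that counts how often it is called (placeholder data only)
fetch_calls <- 0
counting_fetch <- function(acc) {
  fetch_calls <<- fetch_calls + 1
  list(entry_name = paste0(acc, "_DEMO"))
}

cache_file <- tempfile(fileext = ".rds")
first  <- sketch_cached_lookup(c("ACC1", "ACC2"), cache_file, counting_fetch)
second <- sketch_cached_lookup(c("ACC2", "ACC3"), cache_file, counting_fetch)

# ACC2 is served from the cache on the second call, so only three fetches occur
print(fetch_calls)
```

Failed lookups are not stored in this sketch, so they are retried on the next run; whether the package caches failures is not stated in this manual.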
This function sends a GET request to the UniProt REST API and retrieves data based on the provided UniProt accession number.

    retrieve_uniprot_data(accession)

| Argument | Description |
|---|---|
| accession | A character string representing the UniProt accession number. |

Returns a list containing the retrieved data in JSON format, or NULL if the request fails.

    # Example usage
    result <- retrieve_uniprot_data("O88737")
    print(result)

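For readers curious what such a request might look like, the sketch below builds an entry URL and wraps the download in `tryCatch()` so that a failure yields NULL. It is an assumption, not the package's implementation: `uniprot_entry_url` and `sketch_fetch_entry` are hypothetical names, the URL pattern is assumed to be the UniProtKB JSON entry endpoint, and the fetch step depends on the 'jsonlite' package being installed.

```r
# Hypothetical helper: build the assumed UniProtKB JSON URL for one accession.
uniprot_entry_url <- function(accession) {
  paste0(
    "https://rest.uniprot.org/uniprotkb/",
    utils::URLencode(accession, reserved = TRUE),
    ".json"
  )
}

# Hypothetical fetcher: returns a nested list on success, NULL on any failure
# (missing 'jsonlite', no network, unknown accession, malformed response).
sketch_fetch_entry <- function(accession) {
  if (!requireNamespace("jsonlite", quietly = TRUE)) return(NULL)
  tryCatch(
    jsonlite::fromJSON(uniprot_entry_url(accession), simplifyVector = FALSE),
    error = function(e) NULL
  )
}

# Only the URL is built here; no request is sent until sketch_fetch_entry() runs.
print(uniprot_entry_url("O88737"))
```

Returning NULL on failure, as the documented function does, lets downstream code skip an accession instead of stopping a long batch run.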
Checks that an Atlas table has the columns needed by the public website and that dataset labels are not mixed accidentally.

    validate_atlas_data(data, dataset = NULL)

| Argument | Description |
|---|---|
| data | A data frame containing Atlas records. |
| dataset | Optional dataset label: |

Returns a list with valid, errors, warnings, and summary.
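A validation result of this shape (valid, errors, warnings, summary) can be sketched in a few lines of base R. The checks below are purely illustrative: `sketch_validate` is a hypothetical helper, the required column set is an assumption, and the columns and dataset-label rules actually enforced by `validate_atlas_data()` are not listed in this manual.

```r
# Hypothetical validator returning a list shaped like the documented result.
# The `required` columns are an assumption made for this example.
sketch_validate <- function(data, required = c("id", "accession")) {
  errors <- character(0)
  warns <- character(0)

  missing_cols <- setdiff(required, names(data))
  if (length(missing_cols) > 0) {
    errors <- c(errors, paste("Missing columns:", paste(missing_cols, collapse = ", ")))
  }
  if ("id" %in% names(data) && anyDuplicated(data$id) > 0) {
    warns <- c(warns, "Duplicate values found in 'id'")
  }

  list(
    valid = length(errors) == 0,
    errors = errors,
    warnings = warns,
    summary = list(rows = nrow(data), columns = ncol(data))
  )
}

# Made-up table that lacks 'accession' and repeats an ID
report <- sketch_validate(data.frame(id = c(1, 1), species = c("mouse", "rat")))
str(report)
```

Separating errors from warnings lets a caller block an export on hard failures while still surfacing softer issues for review.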