yaoxiangli

RJSONIO - Serialize R Objects to JSON

Converts R objects to and from JavaScript Object Notation (JSON). The package provides a stable interface for reading JSON from strings, files, and connections, and for serializing common R objects, including vectors, lists, data frames, arrays, environments, and S4 objects. It also exposes parser handlers, callbacks, and S4 methods for applications that need customized JSON processing while preserving established RJSONIO behavior.

cpp

11.40 score 1 stars 62 dependents 1.9k scripts 20k downloads

textreuse - Detect Text Reuse and Document Similarity

Tools for measuring similarity among documents and detecting passages which have been reused. Implements shingled n-gram, skip n-gram, and other tokenizers; similarity/dissimilarity functions; pairwise comparisons; minhash and locality sensitive hashing algorithms; and a version of the Smith-Waterman local alignment algorithm suitable for natural language.

peer-reviewed, cpp

8.82 score 201 stars 235 scripts 385 downloads
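
The shingled n-gram and similarity ideas named in the textreuse description can be sketched in a few lines. This is an illustrative Python sketch of the general technique only, not the textreuse API; the function names `shingles` and `jaccard` and the two sample sentences are assumptions made up for this example.

```python
# Illustrative sketch only: these helpers are hypothetical and are not
# part of the textreuse package.

def shingles(text, n=3):
    """Return the set of word n-grams (shingles) found in text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

# Two made-up documents that share a reused passage.
doc_a = "the quick brown fox jumps over the lazy dog"
doc_b = "a quick brown fox jumps over a sleeping dog"

score = jaccard(shingles(doc_a), shingles(doc_b))
print(round(score, 3))
```

Comparing every pair of documents this way scales quadratically, which is the cost that minhash and locality sensitive hashing are generally used to reduce.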

sofa - Connector to 'CouchDB'

Provides an interface to the 'NoSQL' database 'CouchDB' (<https://couchdb.apache.org/>). Methods are provided for managing databases within 'CouchDB', including creating/deleting/updating/transferring, and managing documents within databases. One can connect with a local 'CouchDB' instance, or a remote 'CouchDB' database such as 'IBM Cloudant'. Documents can be inserted directly from vectors, lists, data.frames, and 'JSON'. Targeted at 'CouchDB' v2 or greater.

couchdb, database, nosql, documents, cloudant, couchdb-client

7.64 score 33 stars 55 scripts 577 downloads

medrxivr - Access and Search MedRxiv and BioRxiv Preprint Data

Preprints, preliminary versions of research articles that have yet to undergo peer review, are an increasingly important source of health-related bibliographic content. The two preprint repositories most relevant to health-related sciences are medRxiv <https://www.medrxiv.org/> and bioRxiv, both of which are operated by the Cold Spring Harbor Laboratory. 'medrxivr' provides programmatic access to the 'Cold Spring Harbor Laboratory (CSHL)' API <https://api.biorxiv.org/>, allowing users to download medRxiv and bioRxiv preprint metadata (e.g. title, abstract, publication date, author list) into R. 'medrxivr' also provides functions to search the downloaded preprint records using regular expressions and Boolean logic, as well as helper functions that allow users to export their search results to a .BIB file for import into a reference manager and to download the full-text PDFs of preprints matching their search criteria.

bibliographic-database, biorxiv, evidence-synthesis, medrxiv-data, peer-reviewed, preprint-records, systematic-reviews

7.36 score 62 stars 62 scripts 66 downloads

gendercoder - Recodes Sex/Gender Descriptions into a Standard Set

Provides dictionary-based tools for recoding free-text gender responses into consistent categories while preserving gender diversity where possible. The package standardises spelling, capitalization, whitespace, and common variants through curated named character-vector dictionaries, supports either detailed or collapsed output categories, and can retain original unmatched responses for manual review. It also includes helpers for creating custom dictionaries from approximate string matches and a local interactive application for recoding uploaded data files.

gender-diversity, ozunconf18, unconf

6.77 score 47 stars 57 scripts
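
The dictionary-based recoding described for gendercoder can be illustrated with a minimal sketch. The Python below shows the general approach only and is not the package's interface; the `DICTIONARY` contents, the `recode` function, and its `retain_unmatched` flag are hypothetical names chosen for this example.

```python
# Illustrative sketch only: the dictionary and function below are
# hypothetical and are not gendercoder's built-in data or API.

DICTIONARY = {
    "male": "man",
    "m": "man",
    "man": "man",
    "female": "woman",
    "f": "woman",
    "woman": "woman",
    "non-binary": "non-binary",
    "nonbinary": "non-binary",
}

def recode(responses, dictionary, retain_unmatched=False):
    """Map free-text responses to dictionary categories.

    Case and surrounding or repeated whitespace are normalised before
    lookup. Unmatched responses become None, or are kept as typed when
    retain_unmatched is True so they can be reviewed by hand.
    """
    recoded = []
    for response in responses:
        key = " ".join(response.lower().split())
        if key in dictionary:
            recoded.append(dictionary[key])
        else:
            recoded.append(response if retain_unmatched else None)
    return recoded

raw = [" Male", "F", "NON-binary ", "enby"]
strict = recode(raw, DICTIONARY)
kept = recode(raw, DICTIONARY, retain_unmatched=True)
print(strict)
print(kept)
```

Keeping unmatched responses as typed, rather than discarding them, leaves room to extend the dictionary after manual review.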

bpgmm - Bayesian Model Selection Approach for Parsimonious Gaussian Mixture Models

Model-based clustering using Bayesian parsimonious Gaussian mixture models. MCMC (Markov chain Monte Carlo) is used for parameter estimation, and RJMCMC (reversible-jump Markov chain Monte Carlo) is used for model selection. See Green (1995) <doi:10.1093/biomet/82.4.711>.

armadillo, clustering, clustering-algorithm, cpp, machine-learning, mcmc, rjmcmc, openblas

5.02 score 1 stars 10 scripts 207 downloads

ggvolcano - Publication-Ready Volcano Plots

Provides publication-ready volcano plots for visualizing differential expression results, commonly used in RNA-seq and similar analyses. Plots are built on the 'ggplot2' framework; see Wickham (2016) <doi:10.1007/978-3-319-24277-4>.

4.57 score 75 scripts 287 downloads

cmmr - CEU Mass Mediator RESTful API

CEU (CEU San Pablo University) Mass Mediator is an online tool that helps researchers perform metabolite annotation. 'cmmr' (CEU Mass Mediator RESTful API) allows for programmatic access in R: batch search, batch advanced search, MS/MS (tandem mass spectrometry) search, etc. For more information about the API endpoint, see <https://github.com/YaoxiangLi/cmmr>.

batch-search, ceu-mass-mediator, metablomics, ms-search

4.43 score 16 stars 17 scripts 198 downloads

oglcnac - Processing and Analysis of 'O-GlcNAcAtlas' Data

Provides tools for processing and analyzing data from the 'O-GlcNAcAtlas' database <https://oglcnac.org/>, as described in Ma (2021) <doi:10.1093/glycob/cwab003>. It integrates 'UniProt' <https://www.uniprot.org/> API calls to retrieve additional information. It is specifically designed for research workflows involving 'O-GlcNAcAtlas' data, providing a flexible and user-friendly interface for customizing and downloading processed results. Interactive elements allow users to easily adjust parameters and handle various biological datasets.

2.70 score 3 scripts 129 downloads

rconf - Minimal and Lightweight Configuration Tool

Minimal and lightweight configuration tool that provides basic support for 'YAML' configuration files without requiring additional package dependencies. It offers a simple method for loading and parsing configuration settings, making it ideal for quick prototypes and lightweight projects.

2.70 score 97 downloads

ggpca - Publication-Ready PCA, t-SNE, and UMAP Plots

Provides tools for creating publication-ready dimensionality reduction plots, including Principal Component Analysis (PCA), t-Distributed Stochastic Neighbor Embedding (t-SNE), and Uniform Manifold Approximation and Projection (UMAP). This package helps visualize high-dimensional data with options for custom labels, density plots, and faceting. Plots are built on the 'ggplot2' framework; see Wickham (2016) <doi:10.1007/978-3-319-24277-4>.

2.60 score 4 stars 2 scripts 193 downloads