# Eplet Mismatch

```r
library(hlaR)
library(tidyverse)
library(dplyr)
```

## Aims

In this vignette, we demonstrate single and overall eplet mismatch calculations for both MHCI and MHCII.

## Overview

Eplets corresponding to the donor and recipient alleles are determined using the reference table of choice.

Overall Mismatch: Overall mismatch describes the most common eplet implementation: eplets present in the donor alleles that are not in the corresponding recipient's alleles. From the output of this function, mm_cnt_tt is the total count of mismatched eplets between the recipient and the donor.

Single Antigen: Single antigen mismatch assigns the eplet mismatch to the specific donor antigen responsible for the mismatch. If one of the input donor alleles is NA, it will be excluded from the final report.
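The two definitions above can be sketched with plain set operations. This is a minimal illustration, not the hlaR implementation: the eplet labels (`e1`, `e2`, ...) and allele names are made up, and no reference table is consulted.

```r
# Toy eplet sets; the labels are illustrative, not taken from a reference table.
donor_eplets <- list(
  allele_1 = c("e1", "e2", "e3"),
  allele_2 = c("e2", "e4")
)
recipient_eplets <- c("e2", "e5")

# Overall mismatch: donor eplets that are absent from the recipient's eplets.
overall_mm <- setdiff(unique(unlist(donor_eplets)), recipient_eplets)
overall_cnt <- length(overall_mm)

# Single antigen: the same comparison, reported separately per donor allele.
single_mm <- lapply(donor_eplets, setdiff, y = recipient_eplets)

print(overall_mm)
print(overall_cnt)
print(single_mm)
```

In this toy case the overall count plays the role that mm_cnt_tt plays in the function output, while the per-allele list corresponds to the single antigen view.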

## Use Cases:

To input retrospective data, submit each recipient and donor pair with a matching pair_id. This function can also be used to examine mismatch between a single donor and multiple recipients. To achieve this, repeat the donor information and pair each copy with one of the possible recipients. The same approach works for a single recipient and multiple donors.
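As a rough sketch of the single-donor, multiple-recipient layout, the donor row can be repeated once per recipient and each copy given its own pair_id. The column names (`subject_type`, `a1`, `a2`, `b1`, ...) and the typings below are assumptions for illustration; check them against the bundled test CSV before passing such a table to the mismatch functions.

```r
# Hypothetical typing for one donor and three candidate recipients.
# Column names are assumptions; match them to the bundled test CSV.
donor <- data.frame(
  a1 = "A*01:01", a2 = "A*02:01",
  b1 = "B*07:02", b2 = "B*08:01",
  c1 = "C*07:01", c2 = "C*07:02",
  stringsAsFactors = FALSE
)
recipients <- data.frame(
  a1 = c("A*02:01", "A*03:01", "A*24:02"),
  a2 = c("A*11:01", "A*02:01", "A*01:01"),
  b1 = c("B*07:02", "B*35:01", "B*44:02"),
  b2 = c("B*15:01", "B*08:01", "B*07:02"),
  c1 = c("C*03:04", "C*04:01", "C*05:01"),
  c2 = c("C*07:02", "C*07:01", "C*07:02"),
  stringsAsFactors = FALSE
)

n <- nrow(recipients)

# One row per recipient, plus one repeated donor row per recipient,
# linked by a shared pair_id.
paired <- rbind(
  data.frame(pair_id = seq_len(n), subject_type = "recipient",
             recipients, row.names = NULL, stringsAsFactors = FALSE),
  data.frame(pair_id = seq_len(n), subject_type = "donor",
             donor[rep(1, n), ], row.names = NULL, stringsAsFactors = FALSE)
)

# Sort so that each pair sits on adjacent rows.
paired <- paired[order(paired$pair_id, paired$subject_type), ]
rownames(paired) <- NULL
print(paired)
```

Swapping the roles of `donor` and `recipients` gives the single-recipient, multiple-donor layout.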

### Example: Investigate the Importance of Eplet Mismatch in Retrospective Transplant Cohort

In this example, the transplant team or researcher would like to calculate the Class I and Class II eplet mismatch for a cohort of transplant recipients using high-resolution or imputed four-digit recipient and donor HLA typing data.

## Example 1 - MHC-I

```r
# use the testing data in the library
dat_mhc1 <- read.csv(system.file("extdata/example", "MHC_I_test.csv", package = "hlaR"), sep = ",", header = TRUE)
re_mhc1 <- CalEpletMHCI(dat_mhc1)
# pull the single antigen and overall results out of the returned list
re_mhc1_single <- re_mhc1$single_detail
re_mhc1_overall <- re_mhc1$overall_count
```
##### Step 3: Calculate MHCII mismatch and DRDQ risk score, pull information from result using single and overall outputs.
```r
# use the testing data in the library
dat_mhc2 <- read.csv(system.file("extdata/example", "MHC_II_test.csv", package = "hlaR"), sep = ",", header = TRUE)
re_mhc2 <- CalEpletMHCII(dat_mhc2)
# pull the single antigen, overall, and DR/DQ risk results out of the returned list
re_mhc2_single <- re_mhc2$single_detail
re_mhc2_overall <- re_mhc2$overall_count
re_mhc2_drdq_risk <- re_mhc2$dqdr_risk
```

##### Step 4: Use visuals to examine the data.

The first two histograms examine the distribution of MHCI and MHCII eplet mismatch in this population. The bar chart that follows shows the ten most frequently mismatched MHC Class I eplets in our transplant cohort. DSA information can also be incorporated to examine the presence of specific eplets in patients who develop DSA.

```r
hist(re_mhc1_overall$mm_cnt_tt)

hist(re_mhc2_overall$mm_cnt_tt)
```

```r
# split the comma-separated mismatch strings into one eplet per row
mm_eplets <- strsplit(re_mhc1_single$mm_eplets, split = ",")
mm_eplets <- as.data.frame(matrix(as.factor(unlist(mm_eplets))))
colnames(mm_eplets) <- c("eplets")

# count how often each eplet is mismatched
count <- mm_eplets %>%
  group_by(eplets) %>%
  summarize(count = n())

# plot the ten most frequently mismatched eplets
count %>%
  arrange(desc(count)) %>%
  top_n(10) %>%
  ggplot(aes(eplets, count)) +
  geom_col()
#> Selecting by count
```
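To make the tallying step easier to follow, here is a small base R sketch of the same idea on made-up input. The strings in `mm_strings` are hypothetical stand-ins for a column of comma-separated mismatched eplets, and the NA entry stands for a donor antigen with no reported result.

```r
# Hypothetical comma-separated mismatch strings, one per donor antigen.
mm_strings <- c("e1,e2", "e2,e3", "e2", NA, "e1")

# Split into individual eplets, then drop missing and empty entries.
eplets <- trimws(unlist(strsplit(mm_strings, split = ",")))
eplets <- eplets[!is.na(eplets) & nzchar(eplets)]

# Tally each eplet and keep the (up to) ten most frequent.
eplet_counts <- sort(table(eplets), decreasing = TRUE)
top_eplets <- head(eplet_counts, 10)
print(top_eplets)
```

Dropping NA values before counting keeps missing results from showing up as their own category in the chart.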

## Sources

Eplet reference tables are extracted from the HLAMatchMaker workbooks version 2.1 and 3.1 (http://www.epitopes.net/index.html). We have verified the results generated by hlaR against HLAMatchMaker. A small number of typographical errors in the HLAMatchMaker Excel files prevent some eplets identified in the reference table from carrying over to the results. Because of these errors, the eplet mismatch count from hlaR is occasionally one or two higher than that output by the Excel files. If an allele from the input data is not present in the reference table, the result generated will be NA.

## MatchMaker Logic:

The MHC I function examines donor and recipient HLA A, B, and C. Each donor allele (A, B, C) is compared against ALL recipient alleles.

The MHC II function compares donor and recipient HLA DR, DP, and DQ. Please note that DRBw is an alternate naming of DRB3, 4, 5. Mismatch comparisons for MHCII are slightly more complicated (logic extracted from HLAMatchMaker). For DRB+DRw, DQA, DQB, and DPA, the donor alleles are compared to the corresponding recipient alleles only (i.e., DRB+DRw compared to DRB+DRw, DQB compared to DQB). DPB is managed in one of two ways, depending on whether the eplets are interlocus. If interlocus, they are compared across ALL of the recipient's B alleles (DQB, DRB+DRw, DPB). If not, they are compared only to the recipient's DPB alleles.
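The DPB rule can be sketched as a choice of comparison pool per eplet. This is only an illustration of the logic described above, under assumed inputs: the eplet labels and the `interlocus` flags are invented, and `is_mismatched` is a hypothetical helper, not a function from hlaR.

```r
# Toy recipient eplets by locus; labels are illustrative only.
# "DRB" here stands for the combined DRB+DRw alleles.
recipient <- list(
  DRB = c("r1", "x1"),
  DQB = c("q1"),
  DPB = c("p1")
)

# Donor DPB eplets with hypothetical interlocus flags.
donor_dpb <- data.frame(
  eplet = c("p1", "p2", "x1", "x2"),
  interlocus = c(FALSE, FALSE, TRUE, TRUE),
  stringsAsFactors = FALSE
)

# Hypothetical helper: pick the comparison pool, then test membership.
is_mismatched <- function(eplet, interlocus, recipient) {
  pool <- if (interlocus) {
    unlist(recipient[c("DQB", "DRB", "DPB")])  # all recipient B-chain loci
  } else {
    recipient$DPB                              # recipient DPB only
  }
  !(eplet %in% pool)
}

donor_dpb$mismatch <- mapply(
  is_mismatched, donor_dpb$eplet, donor_dpb$interlocus,
  MoreArgs = list(recipient = recipient), USE.NAMES = FALSE
)
print(donor_dpb)
```

Here `x1` is not counted as a mismatch because it is flagged interlocus and appears among the recipient's DRB eplets, whereas a non-interlocus eplet is only ever checked against the recipient's DPB eplets.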