THE STATE OF SOUTH CAROLINA In The Supreme Court
In the Matter of the Care and Treatment of Andy Eugene Hyman, Respondent.
Appellate Case No. 2024-001781
ON WRIT OF CERTIORARI TO THE COURT OF APPEALS
Appeal from Florence County Roger E. Henderson, Circuit Court Judge
Opinion No. 28330 Heard December 16, 2025 – Filed May 13, 2026
AFFIRMED
Attorney General Alan McCrory Wilson and Assistant Attorney General Christopher Runyan, both of Columbia, for Petitioner.
Senior Appellate Defender Lara Mary Caudy, of Columbia, for Respondent.
David Allen Chaney, Jr. and Meredith Dyer McPhail, both of Columbia, as Amicus Curiae for The American Civil Liberties Union of South Carolina.
CHIEF JUSTICE KITTREDGE: In sexually violent predator commitment proceedings, the South Carolina Office of Mental Health (OMH) is statutorily required to perform a pre-commitment evaluation. See S.C. Code Ann. § 44-48-80(D) (2018). Here, OMH—the State's usual expert—determined that Respondent Andy Hyman was not a sexually violent predator (SVP). In response, the State turned to the Sexual Behavior Clinic and Lab at the Medical University of South Carolina (MUSC) for a second opinion. See S.C. Code Ann. §§ 44-48-80(D), -90(C) (2018) (permitting the party dissatisfied with the results of OMH's initial evaluation to seek a second opinion). MUSC opined that Hyman was an SVP. The difference between the two conflicting expert opinions largely boiled down to the use of the controversial penile plethysmography test (PPG): MUSC regularly performs the test and uses its results to support classifying an offender as an SVP, whereas OMH rejects any notion that the PPG is a valid or reliable assessment for SVP pre-commitment evaluations.
Broadly speaking, the PPG is an attempt to objectively quantify an inherently subjective experience: male sexual arousal. Hyman sought to exclude the PPG evidence in his case, arguing PPGs in general were fundamentally unreliable and lacked the level of standardization requisite in scientific testing. The trial court rejected Hyman's arguments, finding the PPG was a reliable scientific test that objectively quantified Hyman's arousal. The jury found Hyman was an SVP.
On appeal, the court of appeals reversed, holding in accordance with the clear majority rule that the trial court abused its discretion when it found the PPG constituted reliable scientific evidence. Because of the widespread controversy currently surrounding the use of the PPG as an assessment tool for SVP purposes, we granted the State's petition for a writ of certiorari to review the court of appeals' decision.
The PPG procedures employed here by MUSC highlight the glaring lack of standardization prevalent in both the protocols for the test and the analysis of the results—a problem that necessarily prevents any finding of reliability. We hold that PPG results are generally inadmissible in judicial proceedings unless and until the underlying science is more thoroughly developed, thus creating a more structurally sound path for a court to find PPG results are reliable and admissible. Our holding today follows the overwhelming majority rule. As a result, we affirm the court of appeals' decision reversing the trial court and remand the matter for a new commitment proceeding.
I.
To contextualize the facts of this case, we first describe the PPG in more detail. The PPG is designed to measure a man's sexual response to a series of visual and auditory stimuli, with the theoretical goal of objectively discerning whether a man has deviant sexual interests. The procedure—which some courts have described as Orwellian 1 and others as "especially unpleasant and offensive" 2—requires a man to sexually stimulate himself before placing a mercury strain gauge around his penis while engorged, allow his penis to detumesce, and then watch and listen to various erotic and non-erotic stimuli. For the next several hours, an observer records changes in the man's penile circumference after the presentation of each stimulus. The PPG is the subject of intense controversy in the scientific community, with the debate centering around whether the PPG can reliably diagnose deviant sexual arousal, as its proponents claim. Many experts opposed to the PPG have expressed concerns that no universal standards exist for administering the PPG or interpreting the test results. See generally Jason R. Odeshoo, Of Penology and Perversity: The Use of Penile Plethysmography on Convicted Child Sex Offenders, 14 Temp. Pol. & Civ. Rts. L. Rev. 1, 12–13 (2004) ("Finally, and more troublingly, the procedures for administering PPG tests have yet to be standardized. From one facility to the next, important variables may differ, including the type of stimuli used, the content, duration, and interval between presentations, the types of instructions given to subjects, the type of equipment used, as well as how responses are counted. Manufacturers of PPG devices provide instructions and protocols, but these do not appear to be consistently followed, and their manner of use differs from one facility or researcher to the next. 
The [Association for the Treatment of Sexual Abusers] has issued guidelines for PPG administration, but they are of a very general nature and, again, do not appear to be followed with any regularity. . . . At the present time, . . . virtually all researchers agree that [the] PPG's current lack of standardization is unacceptable. That fact alone casts serious doubt on any of the positive findings . . . concerning [the] PPG's reliability and validity." (emphasis added) (footnotes omitted)). The lack of standardization, however, is not the scientific community's only hesitancy. Indeed, some experts have focused on the high rate—around twenty percent—of false positives and false negatives associated with men's ability to willfully suppress or display arousal. 3 Other experts have noted there can be significant differences between the results of an offender's initial PPG and a subsequent retest administered several months later. The disparities may be explained, at least in part, by the PPG's inability to account for a host of variables that affect erectile responses, including, for example, the recency of an offender's last orgasm, his level of intoxication or fatigue, his cardiovascular health, or his current medications. Moreover, the PPG does not factor in other variables that have a less obvious, but no less discernable, impact on erectile responses, including:
1. The age of the offender (as men age, they are less likely to display deviant erectile responses);
2. Intelligence (offenders with lower IQs are more likely to appear deviant than those with higher IQs);
3. The time of year the test is performed (because testosterone levels, which are believed to affect sexual responding, undergo seasonal fluctuations); or
4. The gender of the clinician administering the test (at least one study has found arousal levels may be higher when a female clinician administers the test).
Notably, this list of concerns is non-exhaustive.4 As a result, the scientific community sharply disagrees on whether the PPG should be used as a diagnostic tool during pre-commitment evaluations. 5
1 United States v. Weber, 451 F.3d 552, 554 (9th Cir. 2006).
2 Berthiaume v. Caron, 142 F.3d 12, 16 (1st Cir. 1998).
3 An SVP commitment proceeding imposes upon the State the highest burden of proof available. S.C. Code Ann. § 44-48-100(A) (2018) ("The court or jury must determine whether, beyond a reasonable doubt, the person is a sexually violent predator." (emphasis added)). Thus, it is necessarily troubling for the State to introduce to the factfinder evidence with a significant degree of false positive or false negative results.
4 Indeed, the manner in which the PPG was administered to Hyman reveals further reliability concerns associated with its results.
5 Interestingly, while the scientific community disagrees about the utility of the PPG as a diagnostic tool, it agrees the PPG is a reliable treatment tool for sexual offenders. More specifically, the PPG provides treating clinicians a starting point to open a dialogue with patients regarding their sexual attraction.
With that background in mind, we now turn to the facts of this case.
II.
A.
In 1997, Hyman pleaded guilty to criminal sexual conduct with a minor (CSCM) in the second degree and lewd act on a minor. He was sentenced under the Youthful Offender Act, served a short term in prison, and completed several years of supervised release in 2003. Subsequently, in 2016, Hyman again pleaded guilty to CSCM, this time in the third degree, and was sentenced to ten years' imprisonment. Before his release from prison the second time, the State began civil commitment proceedings under the SVP Act.6
In accordance with the Act, Hyman was referred to OMH for a pre-commitment evaluation, which was performed by Dr. Marie Gehle, OMH's chief psychologist. See S.C. Code Ann. § 44-48-80(D). After conducting a series of standardized tests, Dr. Gehle diagnosed Hyman with pedophilic disorder. However, she concluded Hyman did not meet the statutory definition of an SVP because, in her opinion, he did not pose a heightened risk of reoffending, as the Act requires. See generally id. § 44-48-30(1) (defining an SVP as a person convicted of a qualifying sexually violent offense who "suffers from a mental abnormality or personality disorder that makes the person likely to engage in acts of sexual violence if not confined in a secure facility for long-term control, care, and treatment" (emphasis added)). Consistent with OMH's standard practices, Dr. Gehle did not perform a PPG on Hyman in reaching her conclusion. After Dr. Gehle submitted her report to the court, the State sought a second opinion from Dr. Emily Gottfried, the director of the Sexual Behavior Clinic and Lab at MUSC. See S.C. Code Ann. §§ 44-48-80(D), -90(C). Dr. Gottfried performed many of the same tests as Dr. Gehle, but she also followed her own standard practice of having Hyman undergo a PPG. Like Dr. Gehle, Dr. Gottfried diagnosed Hyman with pedophilic disorder. However, unlike Dr. Gehle, Dr. Gottfried concluded Hyman did qualify as an SVP under the Act because she believed he posed a heightened risk of reoffending.
6 See S.C. Code Ann. §§ 44-48-10 to -180 (2018).
B.
Prior to the start of the commitment trial, Hyman made a motion in limine to exclude the results of his PPG and prohibit Dr. Gottfried from testifying about the test at trial. Hyman argued that the science underlying the PPG was unreliable, specifically pointing to (1) the lack of standardization in administering and interpreting the test and (2) the lack of peer-reviewed studies regarding the validity of Dr. Gottfried's chosen stimuli sets. He also contended the jury was likely to misuse the PPG results in reaching its verdict, asserting the jury would "grab onto the[] results from the PPG . . . to the exclusion of any other information and . . . convict him based on the PPG" alone. The State opposed the motion, claiming the PPG was "widely recognized" and "a standard objective measure of arousal . . . essential in the assessment and treatment of male sex offenders and men with paraphilic interests." The trial court allowed the parties to proffer testimony in support of their positions. Hyman called Dr. Gehle.7 Dr. Gehle testified that standardization in psychological test protocols and scoring is crucial. Without standardization, she explained, there is no means of knowing whether the results of a test on one person are comparable to the results of a test on a different person, or whether the results of a test on one person are comparable to the results of a separately conducted test on that same person. Dr. Gehle claimed it was for just this reason that, despite being the default court-appointed evaluator in every SVP commitment case, OMH never uses PPGs in pre-commitment evaluations.8 According to Dr. Gehle, OMH's reluctance to use the PPG as a diagnostic tool stems from the test's inherent lack of both reliability and validity. 9
7 Dr. Gehle testified she had almost exclusively conducted pre-commitment SVP evaluations for the last eleven years, resulting in her handling approximately 300 SVP evaluations.
8 Because OMH rejects the PPG, the State argued Dr. Gehle did not have sufficient knowledge regarding the intricacies of the test to offer an opinion as to its reliability. However, Dr. Gehle explained that she had researched the test extensively, discussing by name or author several books and articles published as recently as the then-current year.
9 As Dr. Gehle testified, the term "reliability" as used in psychology refers to whether a test subject can take a standardized test multiple times and get similar results. In contrast, the term "validity" refers to whether a particular standardized test is actually measuring what it purports to measure.
As to reliability, Dr. Gehle asserted the "test/retest reliability [for the PPG] is very poor," and test subjects can (and do) get wildly different results if they are given multiple PPGs several months apart, to the point that even an expert cannot compare the test results and reach a conclusion. Dr. Gehle testified a PPG could nonetheless be useful in the treatment setting because it could be used to start a conversation with an offender about his arousal, and any errors in the test would occur in a low-stakes setting with minimal consequences. However, Dr. Gehle opined that the outcome of a pre-commitment evaluation is far more significant, and therefore, the errors prevalent in PPGs would have much more severe consequences if they occurred in that setting.
Likewise, concerning the validity of PPGs, Dr. Gehle explained MUSC essentially conducts two PPGs in one because it administers two different sets of stimuli back-to-back—more than doubling the length of a typical PPG. Dr. Gehle emphasized the high degree of inconsistency across PPG laboratories, noting each individual laboratory in the United States can choose which stimulus set—or even portion of a stimulus set—to use during a PPG. Given the lack of empirical research on the effect of using different stimuli sets back-to-back in the same PPG, Dr. Gehle asserted it was impossible to say if MUSC was in fact measuring the type of arousal it claimed the PPG results showed. In the end, Dr. Gehle concluded from her assessment of peer-reviewed publications that the PPG lacked the kind of reliability and validity inherent in scientific testing and, thus, should not be used "in such a high-stakes evaluation."
In response to Dr. Gehle, the State called Dr. Gottfried to testify as to her view of the reliability of the PPG.10 Dr. Gottfried explained she regularly conducts a PPG along with other psychological tests as part of her pre-commitment evaluations. After explaining the PPG procedure, Dr. Gottfried discussed the quality control measures employed by MUSC, emphasizing that every test performed at MUSC is done in the same way. For example, she asserted MUSC controls the humidity and temperature in the testing room and repeatedly calibrates the gauges before use. Dr. Gottfried additionally explained that MUSC employs countermeasures to ensure test subjects are paying attention and not manufacturing or suppressing arousal.
10 Dr. Gottfried testified that she had been at MUSC for six years, and in those six years, the State had asked her for a second opinion in seventeen pre-commitment evaluations—the total number of pre-commitment evaluations in which she had participated. In those seventeen cases, Dr. Gehle found the offenders did not qualify as an SVP because she believed they did not pose a heightened risk to reoffend. Dr. Gottfried disagreed with Dr. Gehle in more than half of the evaluations (ten of the seventeen), opining the offenders did pose a heightened risk to reoffend largely based on the results of the PPGs.
Dr. Gottfried confirmed each PPG laboratory—both nationally and internationally—uses a different "cut score" to measure when an increase in penile circumference becomes statistically significant enough to constitute arousal. 11 Dr. Gottfried acknowledged there are some limitations to the PPG, including, for example, that "there's really no way of knowing" whether a man's reaction to a particular scenario in a stimulus set was a false positive. It was for that reason that Dr. Gottfried purported to use the PPG results as a single data point in her diagnosis rather than the sole data point.
Dr. Gottfried also testified that despite past criticism, the PPG was increasingly accepted by the scientific community. For example, Dr. Gottfried testified the Diagnostic and Statistical Manual, Fifth Edition (DSM-5) mentions the PPG is occasionally used to compare the strength of deviant and normophilic sexual urges. Likewise, the DSM-5 states the PPG is the "most thoroughly researched and longest used" psychophysiological measure of sexual interest, "although the sensitivity and specificity of diagnosis may vary from one [laboratory] to another." 12 Dr. Gottfried also asserted that the Association for the Treatment of Sexual Abusers "supports the responsible use of the PPG" in both pre-commitment evaluations and treatment. Dr. Gottfried defined "responsible use" as meaning a PPG laboratory would interpret the results in a standardized manner and use the results as a single data point among many in its ultimate decision. Likewise, Dr. Gottfried testified that a 2019 meta-analysis examined the data used in "thirty to fifty [PPG] studies with [] over 10,000 men" and "peered" many of the issues prevalent in PPG research, although Dr. Gottfried did not specify exactly what was peered or how the meta-analysis resolved particular concerns. Dr. Gottfried acknowledged, however, that the meta-analysis author did not know what type of stimuli sets were used to generate the data he analyzed because study authors typically do not disclose that information in their publications. Dr. Gottfried conceded it was highly unlikely every participant in the studies was exposed to the same stimuli sets that MUSC uses.
Dr. Gottfried concluded her testimony by stating that research consistently shows the PPG is "more reliable" in detecting pedophiles than in detecting other kinds of sexual offenders, even though not all men who offend against children are sexually attracted to them. Thus, in Dr. Gottfried's opinion, the PPG is "a reliable indicator of pedophilic interest in sexually violent predator evaluations," and PPG results are "a useful data point," particularly when a test subject is "not forthcoming" about his pedophilic urges.
11 There is no independent agency that certifies or oversees PPG laboratories.
12 The DSM-5 does not refer to any particular stimulus set that a PPG must employ.
Following the experts' testimony, the trial court denied Hyman's motion in limine, finding the probative value of Hyman's PPG outweighed any prejudicial effect.
C.
At trial, Dr. Gottfried testified first on behalf of the State. She began by discussing the results of the various tests she performed on Hyman, including the Static-99R, Static-2002R, MMPI, 13 SASSI, 14 PAI, 15 MIDSA, 16 Hare Psychopathy Test, and SVR-20.17 Dr. Gottfried testified the results of the Static-99R and Static-2002R placed Hyman "squarely within the average rate of recidivism" for sex offenders. Likewise, the remaining test results revealed Hyman: (1) did not suffer from substance use disorder; (2) was not a psychopath; (3) was "reluctant to admit to having undesirable negative reactions on how he responds to things"; (4) may have responded to some of the questions to portray himself in an overly favorable light; (5) was currently feeling stress; and (6) reported some persecutory thoughts.
As for Hyman's PPG results, much of Dr. Gottfried's trial testimony was identical to her testimony at the hearing on the motion in limine, with a few notable differences. For instance, Dr. Gottfried informed the jury that the PPG is "an objective physiological measure of male sexual arousal," "the gold standard of looking at males['] sexual arousal," and a "strong predictor or risk factor for future sexual offending." Dr. Gottfried explained an "objective" PPG was particularly useful in pre-commitment evaluations where the offenders "have understandable motivation" to lie about their sexual predilections. Thus, Dr. Gottfried stated that rather than relying on what a suspected SVP told her, she "want[ed] to try to objectively measure things." Dr. Gottfried noted Hyman's PPG results were consistent with her diagnosis.
13 Minnesota Multiphasic Personality Inventory
14 Substance Abuse Subtle Screening Inventory
15 Personality Assessment Inventory
16 Multidimensional Inventory of Development, Sex, and Aggression
17 Sexual Violence Risk-20
After the State rested, Hyman moved for a directed verdict, arguing the State had not proven he was likely to reoffend. According to Hyman, the "only concrete information" regarding his likelihood of reoffending came from the Static-99R and Static-2002R results, which indicated that he was at average risk to reoffend. In response, the State focused almost exclusively on Hyman's PPG results, claiming the PPG "proved" he was a current danger to the public. The trial court denied Hyman's motion for a directed verdict.
Hyman then called Dr. Gehle, who discussed all of the assessments used by Dr. Gottfried and explained why each of them was either inappropriate or unnecessary to use in Hyman's evaluation.18 In fact, Dr. Gehle testified that, aside from the Static-99R and the Static-2002R, none of the assessments used by Dr. Gottfried were designed to estimate the likelihood of recidivism. Additionally, Dr. Gehle stated that Hyman's scores on the Static-99R and the Static-2002R were the same as those of "the average sex offender," which meant Hyman was not at a heightened risk to reoffend. As was true of her counterpart, Dr. Gehle's statements regarding the PPG largely mirrored her pre-trial testimony, although she added that she was not surprised by Hyman's PPG results given her diagnosis of pedophilic disorder.
During closing arguments, the State emphasized that the PPG results "clearly indicate[d] that [Hyman] has current sexual interest in children," and "that, in and of itself, is enough to put him in a secured facility for long term care[,] control and treatment." Ultimately, the jury deliberated for a mere twenty-two minutes before finding Hyman qualified as an SVP and should be civilly committed.
18 For example, for the SASSI, she explained she did not administer that test because there was no indication that Hyman suffered from a substance use disorder. Similarly, for the MMPI, she explained that she sometimes uses that test in her evaluations, but only "when [she has] questions about somebody's personality and it's been difficult in the interview to get a clear assessment of their personality and come to a conclusion as to whether they have a personality disorder and what type of personality disorder it is"—a difficulty she did not encounter when interviewing Hyman. Interestingly, Dr. Gehle stated that neither she nor any of her colleagues at OMH had ever even heard of the MIDSA, much less performed it during a pre-commitment evaluation.
D.
Hyman appealed, and the court of appeals reversed in an unpublished opinion. In re Care & Treatment of Hyman, Op. No. 2024-UP-271 (S.C. Ct. App. filed July 24, 2024). The court of appeals held the PPG was not reliable as required by Rule 702, SCRE, and State v. Council,19 and therefore, the trial court abused its discretion in admitting Hyman's PPG results. The court of appeals additionally found the trial court's error was not harmless and remanded the case for a new commitment trial under the SVP Act.
We granted the State's petition for a writ of certiorari to review the court of appeals' decision.
III.
The standard of review for evidentiary rulings is typically very deferential. In re Care & Treatment of Bilton, 432 S.C. 157, 161, 851 S.E.2d 442, 444 (Ct. App. 2020). "The admission or exclusion of evidence is a matter within the trial court's sound discretion, and an appellate court may only disturb a ruling admitting or excluding evidence upon a showing of a manifest abuse of discretion accompanied by probable prejudice." State v. Commander, 396 S.C. 254, 262–63, 721 S.E.2d 413, 417 (2011) (cleaned up); Vaught v. A.O. Hardee & Sons, Inc., 366 S.C. 475, 480, 623 S.E.2d 373, 375 (2005) ("To warrant reversal based on the admission or exclusion of evidence, the appellant must prove the error of the ruling and the resulting prejudice, i.e., there is a reasonable probability the jury's verdict was influenced by the wrongly admitted or excluded evidence."). "An abuse of discretion occurs when the conclusions of the trial court either lack evidentiary support or are controlled by an error of law." State v. Pagan, 369 S.C. 201, 208, 631 S.E.2d 262, 265 (2006).
IV.
The admissibility of expert scientific testimony is guided by Rule 702, SCRE, which provides: "If scientific, technical, or other specialized knowledge will assist the trier of fact to understand the evidence or to determine a fact in issue, a witness qualified as an expert by knowledge, skill, experience, training, or education, may testify thereto in the form of an opinion or otherwise." When admitting scientific evidence under Rule 702, SCRE, the trial judge must find the evidence will assist the trier of fact, the expert witness is qualified, and the underlying science is reliable. Council, 335 S.C. at 20, 515 S.E.2d at 518. To determine reliability, the trial judge must examine the scientific evidence in light of "(1) the publications and peer review of the technique; (2) prior application of the method to the type of evidence involved in the case; (3) the quality control procedures used to ensure reliability; and (4) the consistency of the method with recognized scientific laws and procedures." Id. at 19, 515 S.E.2d at 517. Finally, even if the evidence is admissible under Rule 702, the trial judge must determine if the probative value of the evidence is substantially outweighed by the danger of unfair prejudice. See Rule 403, SCRE.
19 335 S.C. 1, 515 S.E.2d 508 (1999).
A.
We agree with the court of appeals' finding that the trial court erred in determining Hyman's PPG results were reliable and, therefore, admissible. The testimony from both Drs. Gehle and Gottfried makes it clear that at least three of the four reliability factors listed above—1, 2, and 4—will never weigh in favor of finding the PPG reliable, at least in its current form.
Factor #1: Publications and Peer Review of the PPG
Looking first at the publications and peer review of the PPG, the scientific community appears polarized as to whether to recognize the PPG as a diagnostic tool for assessing sexual deviancy. Indeed, while some experts have found the PPG inherently unreliable—based, in part, on the lack of standardization and high error rate (upwards of twenty percent)—others have concluded the opposite and found the PPG to be an accurate and reliable diagnostic tool. Compare, e.g., Odeshoo, supra, at 9–13 (collecting cases and publications concluding the PPG is unreliable and should not be used as a diagnostic tool), with J.G. Barker & R.J. Howell, The Plethysmograph: A Review of Recent Literature, 20 Bull. Am. Acad. Psychiatry & Law 13, 13 (1992) (stating the PPG "is a reliable and valid method" and "the best objective measure of male sexual arousal"). Notably, however, even the experts who believe the PPG to be a valuable tool in pre-commitment evaluations acknowledge the test is not standardized.20 In fact, the DSM-5—which Dr. Gottfried and the State cited repeatedly before and during trial—recognizes the lack of standardization, noting that although a PPG is "sometimes" used in assessing paraphilic sexual disorders and is the "most thoroughly researched and longest used," "the sensitivity and specificity of diagnosis may vary from one [laboratory] to another." (Emphasis added). Likewise, the publications in support of the PPG may be less helpful than they appear. As Dr. Gottfried acknowledged in her testimony, PPG studies often do not specify which stimulus set(s) were used during testing. That omission makes it difficult to compare results across studies or to replicate a particular study's results in a subsequent study.
20 For example, Drs. Gregg Dwyer and William Burke are both heavy proponents of the PPG. Nonetheless, both have observed PPGs are not standardized. See, e.g., Lisa Murphy, Rebekah Ranger, J. Paul Fedoroff, Hannah Stewart, R. Gregg Dwyer & William Burke, Standardization of Penile Plethysmography Testing in Assessment of Problematic Sexual Interests, 12 J. Sex. Med. 1853, 1853–54 (Sept. 2015) ("Wide variation exists concerning stimuli types, assessment protocols and means of analyzing and interpreting phallometric results in forensic laboratories in North America. Concerns regarding the lack of standardization in phallometry across sites have been discussed since its creation, however, little improvement has been made. There are challenges in the implementation of standardization within jurisdictions and between countries."). The State also cites Dr. Joseph Plaud as a proponent of the PPG. He has similarly observed the PPG is not standardized. See, e.g., Joseph J. Plaud, The Use of Penile Plethysmography in SVP Assessment and Treatment Decision Making, Sexually Violent Predators: A Clinical Science Handbook 243–54 (O'Donohue & Bromberg, eds.) (2019) ("[E]ven at this time[,] there is no 'standardized' assessment protocol when it comes to PPG administration."). In fact, according to Dr. Plaud, "Civil commitment of sexual offenders should never be based upon PPG findings in any context." Id. at 253.
Because of the significant schism in the scientific community over the reliability of the PPG, we find this factor does not weigh in favor of finding the PPG reliable or admissible.
Factor #2: Prior Use of the PPG in Pre-Commitment Evaluations
Turning to the prior use of the PPG as a diagnostic tool, we begin by noting OMH refuses to use the PPG as part of its pre-commitment evaluations because of its concerns with the test's reliability and validity. This point carries considerable weight given that OMH is the designated court-appointed evaluator in every SVP commitment proceeding.21 As a result, every SVP proceeding in this state includes at least one expert who determines an examinee's likelihood to reoffend without making use of the PPG. 22 Moreover, the larger scientific community disagrees as to whether the PPG is a useful tool in pre-commitment evaluations due to the test's lack of standardization, validity, and reliability. Accordingly, we find this factor does not weigh in favor of finding the PPG reliable or admissible.
21 See S.C. Code Ann. § 44-48-80(D).
22 The State is seemingly content with OMH performing these evaluations without a PPG—until it receives a decision from OMH with which it disagrees. At that point, the State primarily (or, perhaps, solely) seeks a second opinion from an evaluator who will perform a PPG.
Factor #3: Quality Control Procedures
As to the quality control procedures used to ensure the PPG's reliability, Dr. Gottfried testified extensively about the measures employed by MUSC, including controlling the humidity and temperature in the testing room and repeatedly calibrating the gauges. Dr. Gottfried also discussed at length the countermeasures MUSC employs to ensure a test subject is paying attention and not manufacturing or suppressing arousal.
While these quality-control procedures are admirable, there is no evidence in the record to suggest they are standard practice across other PPG laboratories. For that reason, it is perhaps unsurprising that we have seen no research proving these countermeasures are effective. Thus, while we applaud MUSC's efforts to bolster the reliability of its own PPGs, this factor is at best a neutral factor in terms of finding the PPG is a reliable scientific test. Cf. State v. Chavis, 412 S.C. 101, 108, 771 S.E.2d 336, 339 (2015) ("[E]vidence of mere procedural consistency does not ensure reliability without some evidence demonstrating that the individual expert is able to draw reliable results from the procedures . . . which he or she consistently applies.").
Factor #4: Consistency of the PPG with Recognized Scientific Practices
Finally, it is in examining the consistency of the PPG with recognized scientific laws and procedures that the PPG's complete absence of reliability becomes most apparent. As Dr. Gehle explained during her testimony, standardization is one of the most crucial components of scientific study. Following uniform procedures ensures reliability and enables subsequent researchers to reproduce experiments and compare results, thereby validating research findings and improving the accuracy of the data.
Where PPG testing is concerned, however, there are at least seventeen aspects—some minor, and some major—in which tests can vary from laboratory to laboratory. See Max B. Bernstein, Supervised Release, Sex-Offender Treatment Programs, and Substantive Due Process, 85 Fordham L. Rev. 261, 274 (2016) (quoting D. Richard Laws, Penile Plethysmography: Will We Ever Get It Right?, in Sexual Deviance: Issues & Controversies 87–88 (Tony Ward, et al. eds., 2003)). Many of those variables concern key features of PPG administration, including the type of stimuli used and the data sampling rate.23 Although scientists may eventually agree on a uniform way to administer and score PPGs, the current lack of standardization makes PPGs inconsistent with generally recognized scientific practices. We highlight only two of the variables discussed by Drs. Gehle and Gottfried, although there are certainly others.
23 See Bernstein, supra, at 274 ("Without training and without standard procedural guidelines, the following aspects of PPG testing vary greatly from center to center: '(1) Type of gauge used (mechanical, mercury) and transducer placement; (2) Type of stimuli used (audiotapes, slides, videotapes); (3) Content of stimuli used (differences in models); (4) Duration of stimulus presentation (2 sec to > 4 min); (5) Length of interstimulus (detumescence) intervals (fixed time vs. return to baseline); (6) Nature of stimulus categories sampled; (7) Number of categories and of stimuli used for each category; (8) Instructions to subjects (imagine sexual behavior with target vs. no instructions); (9) Whether a warm-up [(i.e., masturbation)] was used and number of assessment sessions; (10) Type of recording instrumentation used; (11) Whether calibration was used to correct for any nonlinear characteristics of recording; (12) Data sampling rate (every 5 sec vs. every min); (13) Whether methods were used to attempt to assess for faking; (14) Gender and other characteristics of the evaluator; (15) Type of data transformation (z-score vs. a deviance index); (16) Characteristics of the laboratory; and (17) Type of sample and setting (outpatient, prison).'" (cleaned up) (quoting Laws, supra, at 87–88)).
First, each PPG laboratory is allowed to (and does) set its own cut score for determining when a measured increase in penile circumference constitutes statistically significant arousal. In particular, Dr. Gottfried testified that the PPG literature indicates a 2.5-millimeter increase in circumference is the threshold for finding true, physiological arousal. Nonetheless, in interpreting PPG results, MUSC sets its own cut score at 5 millimeters to reduce the chance of a false positive. Thus, MUSC increases the recommended cut score by a factor of two, presumably because it is concerned that the 2.5-millimeter threshold is not sufficiently accurate to identify the presence of true sexual arousal. Importantly, however, MUSC's selected cut score is arbitrary; no studies that we know of indicate a 5-millimeter cut score leads to fewer false positives than a 2.5-millimeter cut score. While MUSC's cut score is more conservative than scores used in some other laboratories, Dr. Gottfried did not explain why MUSC had chosen a 5-millimeter threshold instead of, say, 4 millimeters, 7 millimeters, or 10 millimeters. Additionally, because the cut scores for PPGs are not nationally or internationally standardized, nothing stops MUSC or any other laboratory from arbitrarily increasing or decreasing its cut score to a different, equally random threshold in the future.
Second, each PPG laboratory is allowed to (and does) use different stimuli sets, or even portions of stimuli sets, in administering the test. According to Dr. Gottfried, most PPG laboratories present a man with a single set of stimuli over the course of several hours. However, MUSC apparently conducts two PPGs back-to-back using a different stimulus set in each, thereby doubling the length of the typical test. Problematically, the impact of MUSC's unusual, protracted testing has not been studied by the scientific community. Moreover, the fact that MUSC feels the need to subject an examinee to multiple stimuli sets undermines any claim that either set, taken on its own, will produce reliable results. Indeed, Dr. Gottfried acknowledged as much when she stated that if a PPG produced a "valid" result at all, one of the stimuli sets used by MUSC was "more likely" to produce that result than the other.24 Equally troublesome is the fact that MUSC has begun tailoring its two stimuli sets to administer only the scenarios more consistent with the test subject's prior offenses. As Dr. Gehle alluded to, the lack of standardization in administering and scoring the PPG seemingly gives rise to the possibility (or, more likely, probability) that an examinee could be sent to two different laboratories and get two different results based purely on the laboratories' variable and unregulated use of different protocols or stimuli sets. This is wholly inconsistent with recognized scientific practices, preventing any possible finding that the PPG is reliable scientific evidence.
Conclusion
In sum, the one constant from our review of the PPG literature is that the entire scientific community—including Dr. Gottfried and other proponents of the PPG—agrees the test is not standardized, both in terms of its administration and interpretation of test results. That lack of standardization is problematic for several reasons, the most prominent of which is that it renders the test results inherently unreliable. See also, e.g., Odeshoo, supra, at 12–13 (making this same observation by collecting a number of professional resources). Even the scholarly publications on the PPG prove unhelpful as there is no way to tell if a particular PPG laboratory follows the same practices that were used in a given research study. Moreover, the presence of standardization in the future will only go so far to ensure reliability given that there are a number of variables for which the PPG cannot account, including age, intelligence, fatigue, intoxication, cardiovascular health, and recency of orgasm, among others. We therefore hold that, for now, PPG results are not sufficiently reliable and, thus, are inadmissible under Rule 702, SCRE, and Council.25
24 This is not to mention the fact that not all stimuli sets use the same modality: some use audio narratives only, others use static images in conjunction with the audio narratives, and still others use videos.
B.
Even were we to find the PPG was sufficiently reliable to satisfy Rule 702, SCRE, and Council, we nonetheless would hold Hyman's PPG results were inadmissible under Rule 403, SCRE. PPGs are incredibly intimate physiological tests that attempt to quantify a fundamentally subjective experience: sexual arousal. By measuring minute changes in penile circumference in response to stimuli, a PPG device converts a private, allegedly involuntary physiological reaction into numerical data that can be analyzed in a quasi-scientific manner. In doing so, the test clothes the results of a subjective experience in supposed scientific expertise, thereby giving those results an aura of objectivity that ordinary testimony lacks.
There are two natural consequences to this veneer of scientific expertise. First, PPG results become particularly powerful pieces of evidence that are difficult to disregard or meaningfully discount to a factfinder.26 Second, and relatedly, the facade of objectivity diminishes the factfinder's role in making credibility determinations between competing experts, or between an expert and a suspected SVP. Perhaps both of these consequences would not amount to unfair prejudice within the meaning of Rule 403 were it not for one thing: the dubious reliability of the "scientific" evidence. However, given the questionable probative value of PPG results, we find any introduction of PPG results at this time has a natural tendency to confuse and mislead the jury. Accordingly, we hold any probative value of Hyman's PPG results was substantially outweighed by the danger of unfair prejudice, and those results were therefore inadmissible under Rule 403, SCRE. See State v. Floyd Y., 2 N.E.3d 204, 211 (N.Y. 2013) ("However, [New York's SVP Act, known as] article 10[,] does not explicitly limit the hearsay testimony of experts [such as PPG results] even though it essentially envisions a 'battle of the experts' to determine whether the respondent has a mental abnormality. In many article 10 trials, expert testimony may be the only thing a jury hears. Experts enter upon the jury's province, since the expert—and not the jury—draws conclusions from the facts, and there is a correspondingly high risk that jurors will rely on unreliable material only because it was introduced by an expert. Moreover, article 10 trials inevitably involve devastating accusations. At a minimum, each and every article 10 respondent has been convicted of a sex crime. . . . [T]he facts can involve horrible offenses against children. Juries may be predisposed to doubt the convicted sex offender and believe the State's expert. Thus, there is measurable value to a requirement that experts only introduce evidence that bears independent indicia of reliability and sufficient probative value." (emphasis added) (cleaned up)).
25 Our decision comports with the overwhelming majority of jurisdictions that have similarly examined the reliability of the PPG, many of which we have listed in the Appendix to this opinion below. While there are three jurisdictions that find PPG results admissible in court—Washington, Illinois, and Florida—all three have laws clearly distinguishable from South Carolina. More specifically, unlike South Carolina, Washington explicitly permits the introduction of PPG results by statute. See Wash. Rev. Code Ann. § 71.09.050(1) (2026). Likewise, Illinois and Florida still follow the Frye standard. See Frye v. United States, 293 F. 1013, 1014 (D.C. Cir. 1923) (holding scientific testimony was reliable and, therefore, admissible when it had attained general acceptance in the scientific community), superseded by Fed. R. Evid. 702 & 703. This Court never adopted the Frye test and, instead, follows Rules 702 and 703, SCRE. Compare, e.g., State v. Fullwood, 22 So. 3d 655, 656–57 (Fla. Dist. Ct. App. 2009) (declining to assess the reliability of PPG testing under Frye), and In re Commitment of Sandry, 857 N.E.2d 295, 308 (Ill. App. Ct. 2006) ("In this state, our supreme court has made it clear that an inquiry into the validity or reliability of a methodology is outside the scope of the Frye test."), with Bilton, 432 S.C. at 165–66, 851 S.E.2d at 446 (distinguishing Sandry from South Carolina law in thorough fashion by explaining, "Critically, Illinois courts do not examine reliability before 'scientific' evidence is admitted. Reliability is one of the three things a South Carolina court must assess before an expert's testimony is admitted. That same Illinois case also said—explicitly—that it was not holding PPG test results could properly be disclosed to the fact finder. Disclosing the test results to the fact finder is, of course, precisely what happened here." (cleaned up)).
V.
Having found error in the admission of Hyman's PPG results, we next consider whether that error warrants a new commitment trial. "A fundamental principle of appellate procedure is that a challenged decision must be both erroneous and prejudicial to warrant reversal." In re Care & Treatment of Gonzalez, 409 S.C. 621, 636, 763 S.E.2d 210, 217 (2014). A harmless error analysis is a fact-specific inquiry. State v. Byers, 392 S.C. 438, 447–48, 710 S.E.2d 55, 60 (2011). To that end, "No definite rule of law governs a finding of harmless error; rather[,] the materiality and prejudicial character of the error must be determined from its relationship to the entire case." Id. at 448, 710 S.E.2d at 60 (cleaned up) (quoting State v. Reeves, 301 S.C. 191, 193–94, 391 S.E.2d 241, 243 (1990)). An error is harmless when it could not have reasonably affected the result of the trial. Id. (quoting Reeves, 301 S.C. at 194, 391 S.E.2d at 243).
26 Hyman noted this exact concern during the hearing on his motion in limine, asserting the jury would "grab onto the[] results from the PPG . . . to the exclusion of any other information and . . . convict him based on the PPG" alone.
Here, the error in admitting Hyman's PPG results was clearly prejudicial. Dr. Gottfried's testimony regarding the PPG results had the appearance of highly persuasive "scientific" evidence. She described the PPG to the jury as "an objective physiological measure of male sexual arousal," "the gold standard of looking at males['] sexual arousal," and a "strong predictor or risk factor for future sexual offending." It is none of those things.
In its quest to rescue the jury verdict based on harmless error, the State recasts the PPG as simply a small part of the evidence—a single data point among many—rather than the primary data point against Hyman. To the contrary, the State relied heavily on the PPG at all stages of the trial. For example, in responding to Hyman's motion for a directed verdict, the State discussed the PPG results at length to the exclusion of any other evidence when it claimed the PPG "proved" Hyman posed a current danger to the public. Similarly, the State informed the jury during closing arguments that the PPG results "clearly indicate[d] that [Hyman] has current sexual interest in children," and "that, in and of itself, is enough to put him in a secured facility for long term care[,] control and treatment." (Emphasis added).
Aside from the PPG, both experts agreed Hyman was—in Dr. Gottfried's words—"squarely within the average rate of recidivism" for sex offenders. (Emphasis added). Nonetheless, the jury deliberated for a mere twenty-two minutes before reaching a verdict. We see no other conclusion than the admission of the PPG results heavily influenced the outcome of the trial. We therefore decline to find the error in admitting the PPG results harmless.
VI.
Exemplifying the challenge of precisely measuring a private and subjective human emotion, the PPG reflects decades' worth of effort to quantify sexual attraction and arousal. Despite the scientific community's long-standing labors, the reliability of PPG results is fatally undermined by the continuing lack of standardization in test administration and scoring across PPG laboratories. We therefore hold that unless and until the science underlying the PPG becomes more fully developed and uniform, PPG results are inadmissible in South Carolina under Rule 702, SCRE, and Council. Accordingly, we affirm the court of appeals' decision reversing and remanding the matter for a new SVP commitment trial.
AFFIRMED.
FEW, JAMES, HILL, and VERDIN, JJ., concur.
Appendix
Below is a list of jurisdictions we have found which have either commented on the fact or directly held that the PPG is unreliable in part due to its lack of standardization. This list is by no means comprehensive, but it nonetheless serves to show the overwhelming weight of authority tends to support our holding today.
United States Court of Appeals for the First Circuit
United States v. Medina, 779 F.3d 55, 65, 71 (1st Cir. 2015) ("The testing is controversial, both as to whether it is effective and as to whether it is unduly invasive and thus degrading. . . . [W]e acknowledged in each case the unusually invasive nature of such testing and the debate over its reliability. . . . We also remarked on the lack of evidence regarding [] 'the procedure's reliability' . . . ." (citations omitted)).
United States Court of Appeals for the Second Circuit
United States v. McLaurin, 731 F.3d 258, 263 (2nd Cir. 2013) ("The Government, however, cannot point to any consensus on the reliability of plethysmographic data.").
United States Court of Appeals for the Fourth Circuit
United States v. Powers, 59 F.3d 1460, 1471 & n.13 (4th Cir. 1995) ("[T]he scientific literature addressing penile plethysmography does not regard the test as a valid diagnostic tool because, although useful for treatment of sex offenders, it has no accepted standards in the scientific community. Powers has not provided, nor have we found, any decisions acknowledging the validity of the use of penile plethysmography other than in the treatment and monitoring of sex offenders.").
United States Court of Appeals for the Seventh Circuit
United States v. Rhodes, 552 F.3d 624, 626 (7th Cir. 2009) ("Though the use of PPG is not uncommon, experts disagree as to its effectiveness.").
United States Court of Appeals for the Ninth Circuit
United States v. Weber, 451 F.3d 552, 565 (9th Cir. 2006) ("Plethysmograph testing has also been sharply criticized as lacking 'uniform administration and scoring guidelines.' One researcher noted well over a dozen potential sources of variation among different assessments, including the type of measuring device and stimuli that are used, the characteristics of the test, and the setting in which it is conducted. The lack of standard procedures governing plethysmograph testing has led one pair of commentators to conclude that 'research data as well as individual findings derived by plethysmograph must be considered idiosyncratic [and] unamenable to normative comparisons, if not impossible to interpret from a traditional psychometric perspective.'" (cleaned up)).
Rudy-Glanzer ex rel. Doe v. Glanzer, 232 F.3d 1258, 1266 (9th Cir. 2000) ("In fact, courts are uniform in their assertion that the results of penile plethysmographs are inadmissible as evidence because there are no accepted standards for this test in the scientific community. . . . Therefore, even though courts have accepted that penile plethysmographs can help in the treatment and monitoring of sex offenders, these tests have no indicia of reliability as evidence at trial.").
United States District Court for the District of South Dakota
United States v. White Horse, 177 F. Supp. 2d 973, 976 (D.S.D. 2001) ("[T]he PPG is not accepted by the medical profession as a reliable or valid diagnostic tool. DSM–IV–TR, 569 (2000). The PPG is also not accepted by courts as admissible evidence.").
Alaska
Galindo v. State, 481 P.3d 686, 691 (Alaska Ct. App. 2021) (noting the court had repeatedly vacated PPG testing as a condition of probation because of due process implications).
Nelson v. Jones, 781 P.2d 964, 967–68 (Alaska 1989) ("As the trial court remarked in discussing the penile plethysmograph, 'tests often give false results, both positive and negative.' Cases from other jurisdiction suggest that the court's doubts were well founded." (collecting cases); noting an expert "testified that the plethysmograph was useful in the treatment, but not in the assessment, of sexual deviance").
California
In re Mark C., 8 Cal. Rptr. 2d 856, 863 (Ct. App. 1992) ("First, there is authority that the arousal test (penile plethysmograph) is not yet accepted in the scientific community as a reliable means of diagnosing sexual deviancy.").
People v. John W., 229 Cal. Rptr. 783, 785 (Ct. App. 1986) ("Based upon these standards, there was clearly no acceptable showing in this case that the physical test was a reliable means of diagnosing sexual deviancy.").
Georgia
Gentry v. State, 443 S.E.2d 667, 669 (Ga. Ct. App. 1994) ("Testimony at trial revealed that reliability of the technique is by no means established. No scholarly works discussing it were introduced. No national guidelines have been adopted for its use. Given the rejection of penile plethysmograph evidence by other states, and particularly the uncertainty within the scientific community of its reliability, we hold that it is inadmissible in Georgia.").
Maine
Cooke v. Naylor, 573 A.2d 376, 379 (Me. 1990) ("Naylor's own expert testified that the test offers only 'probabilities and similarities to pedophilic profiles,' and other experts testified that these tests are of questionable reliability even in establishing a correct pedophilic or non-pedophilic profile for the individual undergoing evaluation. In the absence of any proof that this evidence was scientifically reliable or could be of benefit to the trier of fact, we find no error in the court's decision not to admit it.").
Massachusetts
Commonwealth v. Ortiz, 100 N.E.3d 790, 796, 797 (Mass. App. Ct. 2018) ("First, we discern no error in the judge's finding that, although the PPG appears to be commonly used as a tool in the treatment of sex offenders, it is not generally accepted in the clinical community for use in diagnosis. . . . For a test such as this, the stimuli used are by definition intrinsic to the result produced. With no standardized guidelines for either the content or even the mechanism of stimulus (audio or visual), the reliability of the procedure appears inherently dubious.").
Missouri
In re Care & Treatment of Kirk, 520 S.W.3d 443, 462–63 (Mo. 2017) (en banc) (noting the expert who was advancing the reliability of the PPG testified "that the test suffered from a number of problems, including standardization, accuracy, reliability, and other issues," while the expert opposing admission of the PPG testified "that, to his knowledge, the PPG had not been standardized or cross-validated and the majority of professionals in his field would agree that the PPG should only be used for treatment").
New York
Dutchess Cnty. Dep't of Soc. Servs. v. Mr. G., 534 N.Y.S.2d 64, 71 (Fam. Ct. 1988) ("Moreover, the results of the plethysmograph as a predictor of human behavior cannot be considered. The proof establishes that it is not only a device with, at best, questionable professional recognition, but one whose conceded margin of error is too great to reliably forecast [a child's] safety.").
North Carolina
State v. Spencer, 459 S.E.2d 812, 815–16 (N.C. Ct. App. 1995) ("[T]here is substantial disagreement as to the extent to which the penile response is subject to voluntary control and as to whether the penile response as measured by the plethysmograph can then be generalized to anything else pertaining to sexual behavior. . . . [T]he forensic validity of the instrument is highly suspect, and the utility of what the plethysmograph shows is highly questionable and the possibility of misleading the trier of fact or the jury is very high, dangerously high. We agree with the trial court that the evidence before it by no means established the reliability of the plethysmograph; there is a substantial difference of opinion within the scientific community regarding the plethysmograph's reliability to measure sexual deviancy. See e.g., Barker and Howell, The Plethysmograph: A Review of Recent Literature, 20 Bull. Am. Acad. of Psychiatry and Law 13 (1992) (identifying several problems with the reliability of the plethysmograph, namely 'lack of standards for training and interpretation of data, lack of norms and standardization and susceptibility of the data to false negatives and false positives,' and concluding that 'despite the sophistication of the current equipment technology, a question remains whether the information emitted is a valid and reliable means of assessing sexual preference'); see also, Myers, et al., Expert Testimony in Child Sexual Abuse Litigation, 68 Neb. L. Rev. 1, 134–35 (1989) (stating that a problem with the reliability of penile plethysmograph testing is that penile response is subject to voluntary control, and the test should not be used to determine whether or not an individual has engaged in deviant behavior). . . . In view of the lack of general acceptance of the plethysmograph's validity and utility and therefore, its reliability for forensic purposes in the scientific community in which it is employed, we hold that the trial court did not abuse its discretion in finding defendant's plethysmograph testing data insufficiently reliable to [be admitted]." (cleaned up)).
South Carolina
In re Care & Treatment of Bilton, 432 S.C. 157, 162, 851 S.E.2d 442, 444 (Ct. App. 2020) ("The test is controversial and has been criticized for a lack of standardization and for being subject to manipulation.").
Texas
In re A.V., 849 S.W.2d 393, 399 (Tex. App. 1993) ("In the instant case, we have grave reservations whether appellee has established that the penile plethysmograph is a reliable test of a person's alleged sexual deviancy, and whether such a test is generally accepted in the scientific community as a valid indicator of sexual preferences or disorders. There was no medical testimony presented on the issue of reliability, and [the expert witness] himself admitted that a person of high intelligence who does some reading can easily manipulate the results of the penile plethysmograph. Based upon the record before us, we are unable to conclude that appellee has established the reliability of the penile plethysmograph.").
Virginia
Billips v. Commonwealth, 652 S.E.2d 99, 102 (Va. 2007) ("The record is devoid of any evidence of the reliability of plethysmograph testing . . . .").