Peter Reznicek / ShadowTrader — Backtest Analysis

Executive Summary

52.5%

Win Rate (TF-matched)

+0.08%

Avg Return per Idea

951

Ideas Backtested

p=0.77

Statistical Significance

Bottom line: Across 951 backtested trading ideas spanning 19 years of weekly video content, Peter Reznicek's overall track record is statistically indistinguishable from random (p=0.77). After Bonferroni correction for 20 tests, only 4 subsets survive: Gold Long (p_adj=0.037), Gold Long Weeks (p_adj=0.025), SPY Long (p_adj=0.037), and the cherry-pick combination (p_adj=0.001). However, his gold "edge" does not beat simply buying GLD on the same dates (paired t-test p=0.40).

Data quality warning: LLM extraction has 74% precision / 66% recall / 7% hallucination rate. Direction inversions (~14%) can corrupt individual backtest entries. Findings should be interpreted as directional signals, not precise measurements. See Validation section below.

Methodology

We scraped transcripts from 299 of Peter Reznicek's ShadowTrader Weekend Edition YouTube videos spanning September 2007 through February 2026. Each transcript was processed through an LLM (Llama 3.1 8B) to extract structured trading ideas with ticker, direction, entry condition, confidence level, timeframe, and price levels. 2315 total ideas were extracted, of which 951 were backtestable (had a directional call with a resolvable ticker and price data available).

Backtesting assumes entry at Monday's open following the weekend video publication. Futures tickers are proxied to ETFs (ES→SPY, GC→GLD, BTC→BTC-USD, etc.). Returns are measured at the horizon matching the idea's stated timeframe: day trades at 5 trading days, swing at 10 days, weeks at 20 days, and months/ongoing at 30 days.
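The entry and horizon rules above can be sketched in a few lines. This is a minimal illustration, not the pipeline actually used: the function names, the `closes` layout, and the choice to measure at the close are assumptions, and market holidays are ignored.

```python
from datetime import date, timedelta

# Horizon (in trading days) for each stated timeframe, as described above.
HORIZON_DAYS = {"day": 5, "swing": 10, "weeks": 20, "months": 30, "ongoing": 30}


def next_monday(published: date) -> date:
    """First Monday strictly after the publication date (holidays ignored)."""
    days_ahead = (7 - published.weekday()) % 7  # Monday is weekday 0
    return published + timedelta(days=days_ahead or 7)


def matched_return(entry_open: float, closes: list[float],
                   timeframe: str, direction: str) -> float:
    """Percent return at the horizon matching the idea's stated timeframe.

    Assumed layout: closes[i] is the close i+1 trading days after entry.
    Short ideas simply flip the sign of the long return.
    """
    horizon = HORIZON_DAYS[timeframe]
    exit_price = closes[horizon - 1]
    raw = (exit_price / entry_open - 1.0) * 100.0
    return raw if direction == "long" else -raw
```

For a video published on Sunday 2020-03-22, `next_monday` gives 2020-03-23, and a "weeks" idea would be scored against the 20th close after that entry.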

Caveats: Auto-generated captions introduce ticker errors (mitigated with correction maps). Options structures are evaluated directionally, which understates performance for well-structured spreads. "Already in" positions are measured from Monday open, not actual entry — a meaningful limitation for active positions called mid-week.

Extraction quality (validated against 15 transcripts): Ticker accuracy 95%, direction accuracy 86-90% (with critical inversions where LLM flips long/short), idea type accuracy 65-78%, timeframe accuracy 69-85%. Hallucination rate ~7% (fabricated tickers, observations turned into positions). Llama 3.1 8B precision 74%, recall 66% — it misses ~1/3 of actual trade ideas and over-extracts sector commentary as individual trades. These error rates are factored into our confidence assessments.

Fixed 10-Day vs. Timeframe-Matched Results

Measuring all ideas at a fixed 10-day horizon may understate performance for longer-duration calls. Matching each idea to its stated timeframe improves results modestly, though the difference is not statistically significant (paired t-test p=0.60).

Measurement	Win Rate	Avg Return	Cumulative
Fixed 10-day	50.7%	-0.01%	+9.9%
Timeframe-matched	52.5%	+0.08%	+75.0%

All subsequent analysis uses timeframe-matched results unless noted.

Cumulative Performance

+75%

Total Cumulative

+405%

Peak

-339%

Max Drawdown

0.05

Annualized Sharpe
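The report does not state how these figures are computed. They are consistent with an additive equity curve, i.e. a running sum of per-idea percent returns (951 ideas at roughly +0.08% each comes to about +75%), which would also explain how a drawdown can exceed 100% in magnitude. The sketch below shows that interpretation; the function name and the additive assumption are illustrative, not confirmed by the source.

```python
from itertools import accumulate


def equity_stats(returns_pct: list[float]) -> dict[str, float]:
    """Cumulative, peak, and max drawdown of an additive (non-compounded) curve."""
    curve = list(accumulate(returns_pct))  # running sum of percent returns
    running_peak = 0.0                     # the curve starts at 0%
    max_drawdown = 0.0
    for value in curve:
        running_peak = max(running_peak, value)
        max_drawdown = min(max_drawdown, value - running_peak)
    return {
        "cumulative": curve[-1],
        "peak": running_peak,
        "max_drawdown": max_drawdown,
    }


stats = equity_stats([10.0, -5.0, 20.0, -30.0, 5.0])
```

In this toy series the curve peaks at +25%, falls to -5% (a 30-point drawdown), and ends flat. A compounded equity curve would give different numbers.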

Performance by Era

Era	n	Win Rate	Avg Return	Median
2007–2010	80	63.7%	+1.44%	+2.17%
2011–2015	188	49.5%	+0.15%	-0.16%
2016–2019	189	57.7%	+0.30%	+0.67%
2020–2021	211	55.5%	+0.58%	+0.58%
2022–2023	44	36.4%	-0.62%	-1.89%
2024–2026	238	47.1%	-0.93%	-0.54%

Trend: His best era was 2007–2010 (63.7% WR during the financial crisis, a good environment for active macro calls). Results have been uneven since, with 2022–2023 (36.4% WR) and 2024–2026 (47.1% WR) as his worst periods. If anything, his calls have gotten worse over time, not better.

Year-by-Year Breakdown

Year	Ideas	Win Rate	Avg	Cumulative
2007	38	68.4%	+1.64%	+62.2%
2008	26	53.8%	+0.47%	+12.3%
2009	3	66.7%	+0.07%	+0.2%
2010	13	69.2%	+3.11%	+40.5%
2011	4	50.0%	+4.50%	+18.0%
2012	90	46.7%	-0.59%	-52.9%
2013	91	53.8%	+0.70%	+63.6%
2015	3	0.0%	-0.35%	-1.0%
2016	8	62.5%	+1.70%	+13.6%
2017	36	50.0%	-0.84%	-30.3%
2018	92	55.4%	-0.11%	-10.2%
2019	53	66.0%	+1.57%	+83.1%
2020	79	55.7%	+1.81%	+143.2%
2021	132	55.3%	-0.16%	-21.4%
2022	39	38.5%	-0.74%	-28.8%
2023	5	20.0%	+0.28%	+1.4%
2024	114	46.5%	-0.77%	-87.4%
2025	115	47.0%	-1.13%	-130.3%
2026	9	55.6%	-0.39%	-3.5%

Long vs. Short

Direction	n	Win Rate	Avg Return	p-value
Long	669	57.5%	+0.37%	0.2684
Short	282	40.4%	-0.61%	0.2049

His shorts lose money on average, although the result is not statistically significant (p=0.20). At a 40.4% win rate with -0.61% avg, his short calls destroy value in this sample. Going long when he says short would have averaged +0.61%.

By Stated Timeframe

Each idea is measured at the horizon matching Peter's stated timeframe for the trade.

Timeframe	Measured At	n	Win Rate	Avg Return	p-value
Day	5-day	245	51.8%	-0.06%	0.86
Swing	10-day	304	48.7%	-0.21%	0.6693
Weeks	20-day	356	56.5%	+0.13%	0.8023
Months	30-day	14	42.9%	+3.10%	0.3425
Ongoing	30-day	32	53.1%	+1.98%	0.2008

His "weeks" timeframe calls have the highest win rate at 56.5%, which is consistent with him doing better when his thesis has time to play out, though the +0.13% average return is not significant (p=0.80). Day and swing trades are essentially random.

Direction × Timeframe Matrix

Direction	Timeframe	n	Win Rate	Avg Return
Long	Day	145	60.7%	+0.44%
Long	Swing	235	48.9%	-0.28%
Long	Weeks	251	64.1%	+0.58%
Short	Day	100	39.0%	-0.79%
Short	Swing	69	47.8%	+0.01%
Short	Weeks	105	38.1%	-0.94%

Best combination: Long + Weeks (64.1% WR, +0.58% avg, n=251). His longer-duration bullish thesis calls are where the marginal value is.

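Worst combination: Short + Weeks (38.1% WR, -0.94% avg, n=105), with Short + Day close behind (39.0% WR, -0.79% avg, n=100). Short + Day is the subset tested for significance (p=0.07, marginal) and is treated below as a fade signal: do the opposite.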

By Idea Type

Type	n	Win Rate	Avg Return	p-value
Active Position	396	51.5%	+0.30%	0.5296
Conditional Breakout	309	51.8%	-0.09%	0.8087
Directional Call	124	60.5%	+0.74%	0.4309
Options Trade	122	49.2%	-0.87%	0.2186

His directional calls (60.5% WR) are his best idea type. His options trade suggestions are his worst — 49.2% WR with -0.87% avg, suggesting his options structuring doesn't add value beyond the directional signal.

Best & Worst Combinations (n ≥ 10)

Top Performers

Direction	Type	Timeframe	n	WR	Avg	p
Short	Directional Call	Weeks	16	43.8%	+3.42%	0.5067
Long	Conditional Breakout	Weeks	72	70.8%	+2.24%	0.0048	★★
Long	Active Position	Ongoing	27	55.6%	+2.00%	0.228
Long	Active Position	Day	60	61.7%	+1.21%	0.211
Long	Directional Call	Swing	16	68.8%	+1.17%	0.3981
Short	Active Position	Swing	25	48.0%	+0.62%	0.7504
Long	Directional Call	Day	17	64.7%	+0.59%	0.4348
Long	Options Trade	Weeks	34	55.9%	+0.47%	0.7687

Worst Performers

Direction	Type	Timeframe	n	WR	Avg	p
Short	Options Trade	Swing	20	55.0%	-0.60%	0.6282
Short	Active Position	Weeks	31	32.3%	-0.66%	0.6958
Long	Conditional Breakout	Swing	61	41.0%	-1.24%	0.2603
Long	Options Trade	Swing	32	40.6%	-1.38%	0.2684
Short	Active Position	Day	29	20.7%	-1.44%	0.1219
Short	Conditional Breakout	Weeks	50	42.0%	-1.62%	0.1334
Short	Options Trade	Day	12	41.7%	-3.01%	0.0749	†

Ticker Breakdown (5+ calls)

Ticker	n	Win Rate	Avg	Median	MFE/MAE	p-value
SPY	202	48.5%	-0.18%	-0.06%	0.93	0.5314
QQQ	86	66.3%	+1.36%	+1.46%	1.3	0.0405	★
AAPL	71	50.7%	-0.90%	+0.11%	0.83	0.2861
GLD	56	69.6%	+1.82%	+1.18%	1.96	0.0045	★★
AMZN	50	54.0%	+1.36%	+0.87%	1.22	0.1594
TSLA	42	33.3%	-1.93%	-4.02%	0.96	0.3359
GOOGL	34	61.8%	+0.68%	+0.60%	1.68	0.4362
NFLX	28	42.9%	-0.45%	-1.48%	0.95	0.742
IWM	27	48.1%	-1.44%	-0.02%	0.86	0.207
BTC-USD	21	66.7%	+3.56%	+1.76%	1.58	0.0809	†
USO	16	31.2%	-2.25%	-1.66%	0.69	0.7373
XLK	11	54.5%	+1.45%	+0.38%	0.97	0.4268
NVDA	10	20.0%	-2.84%	-2.81%	0.95	0.1974
MSTR	9	44.4%	+3.29%	-1.37%	3.33	0.7419
MSFT	9	55.6%	+0.76%	+0.58%	1.19	0.6678
XLF	8	75.0%	+1.96%	+1.33%	1.46	0.1055
COIN	8	25.0%	-3.93%	-4.54%	0.72	0.4635
TLT	8	62.5%	+1.63%	+3.28%	1.24	0.2541
CMG	8	62.5%	+1.78%	+0.97%	1.85	0.6111
META	7	57.1%	+0.27%	+1.56%	1.03	0.932
GS	7	57.1%	+3.31%	+2.84%	1.68	0.4391
DIA	6	83.3%	+2.08%	+1.46%	1.69	0.3379
XLP	5	80.0%	+0.91%	+1.42%	1.02	0.4185
XLV	5	20.0%	-1.43%	-2.04%	0.81	0.1603
UVXY	5	40.0%	-14.31%	-27.62%	0.77	0.2183
ISRG	5	60.0%	+2.97%	+5.38%	1.4	0.234
ADBE	5	40.0%	-3.06%	-1.10%	0.75	0.2738

Does His Confidence Predict Returns?

Confidence	n	Win Rate	Avg Return
High	721	53.0%	+0.27%
Medium	222	51.4%	-0.48%
Low	8	37.5%	-1.70%

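No. Confidence-return correlation: r=0.0417, p=0.20. His stated confidence has no statistically detectable predictive value. "High confidence" calls win only slightly more often than "medium" (53.0% vs. 51.4%), and the gap in average return (+0.27% vs. -0.48%) does not produce a significant correlation.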

Rolling 50-Idea Win Rate Over Time

The 50% line represents coin-flip performance. Sustained periods above 55% are rare and getting rarer.

Statistical Significance Tests

One-sample t-tests against H₀: mean return = 0. Markers indicate: ★★★ p<0.001, ★★ p<0.01, ★ p<0.05, † p<0.10 (marginal). All p-values in this table are uncorrected for multiple comparisons.

Subset	n	Win Rate	Avg Return	t-stat	p-value
All Ideas	951	52.5%	+0.08%	+0.29	0.775
All Longs	669	57.5%	+0.37%	+1.11	0.2684
All Shorts	282	40.4%	-0.61%	-1.27	0.2049
Day Trades	245	51.8%	-0.06%	-0.18	0.86
Swing Trades	304	48.7%	-0.21%	-0.43	0.6693
Weeks Trades	356	56.5%	+0.13%	+0.25	0.8023
Long Directional Calls	93	64.5%	+0.39%	+0.43	0.6669
Long Conditional Breakouts (Weeks)	72	70.8%	+2.24%	+2.91	0.0048	★★
Long Active Positions	306	57.2%	+0.52%	+0.93	0.3541
Short Day Trades	100	39.0%	-0.79%	-1.84	0.0685	†
Gold — All	56	69.6%	+1.82%	+2.97	0.0045	★★
Gold — Long	47	72.3%	+2.20%	+3.31	0.0018	★★
Gold — Long Weeks	23	82.6%	+3.62%	+3.69	0.0013	★★
QQQ — All	86	66.3%	+1.36%	+2.08	0.0405	★
BTC — All	21	66.7%	+3.56%	+1.84	0.0809	†
TSLA — All	42	33.3%	-1.93%	-0.97	0.3359
USO — All	16	31.2%	-2.25%	-0.34	0.7373
NVDA — All	10	20.0%	-2.84%	-1.39	0.1974
Fade Short Day Trades	100	59.0%	+0.79%	+1.84	0.0685	†

Statistically significant positive subsets:
Gold Long (p=0.002) • Gold Long Weeks (p=0.001) • Long Conditional Breakouts Weeks (p=0.005) • QQQ (p=0.04)

Marginally significant negative subsets:
Short Day Trades (p=0.07): below the conventional 0.05 threshold for significance, but with consistent negative expectancy. Fading these yields +0.79% avg.
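For readers who want to reproduce the one-sample tests in this section, here is a minimal sketch using only the Python standard library. The p-value uses a normal approximation, which is an assumption made for simplicity: it is reasonable for large subsets (n in the hundreds) but understates p for small ones such as n=23, where a Student t distribution (for example `scipy.stats.ttest_1samp`) is the appropriate tool. Function names are illustrative.

```python
import math
from statistics import NormalDist, mean, stdev


def p_value_normal(t_stat: float) -> float:
    """Two-sided p-value for a test statistic under a normal approximation."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(t_stat)))


def one_sample_t(returns_pct: list[float]) -> tuple[float, float]:
    """t-statistic for H0: mean return = 0, with an approximate p-value."""
    n = len(returns_pct)
    standard_error = stdev(returns_pct) / math.sqrt(n)
    t_stat = mean(returns_pct) / standard_error
    return t_stat, p_value_normal(t_stat)


# Sanity check against the "All Ideas" row (t = +0.29, n = 951):
p_all_ideas = p_value_normal(0.29)  # roughly 0.77, in line with the table
```

The check on the "All Ideas" row lands close to the reported 0.775; the small gap is expected because the t-statistic in the table is rounded.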

Gold: His Genuine Edge

72.3%

Long Gold Win Rate

+2.20%

Avg Return

47

Calls Over 19 Years

p=0.002

Statistical Significance

Gold Subset	Horizon	n	Win Rate	Avg Return
Long — Day	5-day	2	100.0%	+2.21%
Long — Swing	10-day	13	76.9%	+1.47%
Long — Weeks	20-day	23	82.6%	+3.62%
Long — Months	30-day	4	25.0%	+0.15%
Long — Ongoing	30-day	5	40.0%	-0.79%

This is real. 47 long gold calls across 19 years with a 72.3% win rate and p=0.002. His gold thesis — debasement trade, central bank buying, inflation hedge — has been consistently validated by price action. His long gold calls at a "weeks" timeframe are his single best subset: 82.6% WR, +3.62% avg, p=0.001.

Optimal Cherry-Pick Strategy

What if you only acted on his statistically significant subsets?

Strategy: Take his long gold calls + long conditional breakouts (weeks timeframe) + go long when he says short on day trades. Ignore everything else.

65.1%

Win Rate

+1.45%

Avg Return per Idea

+308%

Cumulative Return

p=0.0001

Statistical Significance

Component	n	Win Rate	Avg
Long Gold	47	72.3%	+2.20%
Long Cond. Breakouts (Weeks)	65	69.2%	+1.93%
Fade Short Day Trades	100	59.0%	+0.79%
Combined	212	65.1%	+1.45%

This strategy is statistically significant at p=0.0001 across 212 trades over 19 years. It produces a cumulative +308% return by selectively filtering his output. The key insight: Peter Reznicek is not a bad analyst — he's an analyst who publishes too much noise alongside a few genuine signals.
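The combined figures can be checked by pooling the three components, weighting each by its number of ideas. The sketch below does that arithmetic; the fade component uses the short day-trade average with its sign flipped (+0.79%), as in the significance table, and the variable and function names are illustrative.

```python
# (label, n, win rate %, avg return %) for each cherry-pick component.
components = [
    ("Long Gold", 47, 72.3, 2.20),
    ("Long Cond. Breakouts (Weeks)", 65, 69.2, 1.93),
    ("Fade Short Day Trades", 100, 59.0, 0.79),  # short-day average, sign flipped
]


def pool(rows):
    """n-weighted win rate and average return, plus additive cumulative return."""
    total_n = sum(n for _, n, _, _ in rows)
    win_rate = sum(n * wr for _, n, wr, _ in rows) / total_n
    avg_return = sum(n * avg for _, n, _, avg in rows) / total_n
    cumulative = avg_return * total_n
    return total_n, win_rate, avg_return, cumulative


total_n, win_rate, avg_return, cumulative = pool(components)
```

Pooling reproduces the headline numbers: 212 ideas, a 65.1% win rate, +1.45% average, and about +308% when returns are summed rather than compounded.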

Evidence: Specific Trade Examples

Each finding below is illustrated with real trade calls extracted from actual video transcripts. Video links are provided for independent verification. Quotes are from auto-generated captions processed through an LLM extractor — minor paraphrasing may exist, but the directional calls, tickers, and levels are faithful to the source material.

1 Gold Is His Genuine Edge — p=0.002, 72.3% Win Rate Over 47 Calls

Across 19 years of Weekend Edition videos, Peter's long gold calls win 72.3% of the time with an average return of +2.20%. This is his most consistent statistically significant positive signal. His gold thesis (debasement, central bank buying, inflation hedge) is validated by price action.

Gold Win #1: WeekendEdition12 21 25

+11.01% (20d)

2025-12-21 LONG GC → GLD Conditional Breakout TF: Weeks Conf: Medium

"Peter's technical analysis suggests that the market will continue to crawl upwards"

Entry: if gold breaks above 4407, then continue to crawl upwards

Levels: 4407, 4600-4700 by early Q2

Entry price: $406.98 MFE: +25.24% MAE: -2.86%

5d: -1.99% | 10d: +0.55% | 20d: +11.01% ◀ | 30d: +8.58%

Gold Win #2: ShadowTrader Video Weekly - 07.07.13

+10.85% (30d)

2013-07-07 LONG GC → GLD Directional Call TF: Months Conf: High

"I think gold is going to be a screaming buy here"

Entry: bounce at 1080, 1100

Levels: 1080, 1100

Entry price: $119.09 MFE: +15.50% MAE: -0.13%

5d: +4.27% | 10d: +8.19% | 20d: +5.55% | 30d: +10.85% ◀

Gold Win #3: Gold About to Explode | ShadowTrader Weekend Edition - August 29, 2025

+10.80% (20d)

2025-08-29 LONG GCZ → GLD Conditional Breakout TF: Weeks Conf: High

"Gold has been consolidating for four months, with a strong technical picture and fundamental support from interest rate policy"

Entry: break above resistance at 3500

Levels: resistance at 3500, potential target of 3750

Entry price: $320.82 MFE: +25.71% MAE: -0.18%

5d: +4.13% | 10d: +5.85% | 20d: +10.80% ◀ | 30d: +18.69%

His worst gold long for comparison:

Worst Gold Long: ShadowTrader Video Weekly 11.25.12

-5.35% (30d)

2012-11-25 LONG GC → GLD Directional Call TF: Months Conf: Medium

"If gold can close above the 2000 psychological level, it will be a strong indication that the move is sustainable and could lead to much higher prices."

Entry: close above 2000 psychological level

Levels: 2000, 1940-1950

Entry price: $169.56 MFE: +0.10% MAE: -6.59%

5d: -2.02% | 10d: -2.22% | 20d: -5.27% | 30d: -5.35% ◀

Note: Even his worst gold long was only -5.35%. His long gold calls average +2.20%, with max favorable excursion regularly exceeding +10%. The risk/reward profile is excellent: tight losses, outsized wins.

2 Long Conditional Breakouts at Weeks Timeframe — p=0.005, 70.8% Win Rate

When Peter identifies a specific breakout level and says "if X breaks above Y, then Z" with a weeks-long holding period, he's right 70.8% of the time across 72 calls. These calls tend to be well-reasoned technical setups where he identifies support/resistance and waits for confirmation.

Breakout Win #1: Balance to Excess | ShadowTrader Video Weekly 03.22.20

+19.87% (20d)

2020-03-22 LONG NDX → QQQ Conditional Breakout TF: Weeks Conf: High

"When this line hits, I will be a buyer of tech stocks very heavily."

Entry: when the trend line hits around 5700

Levels: 5700 (trend line), 6000 (upper bound of trend line)

Entry price: $165.28 MFE: +33.45% MAE: -3.50%

5d: +12.36% | 10d: +14.95% | 20d: +19.87% ◀ | 30d: +27.35%

Breakout Win #2: ShadowTrader Video Weekly 12.23.18 - Where's the Bottom?

+13.89% (20d)

2018-12-23 LONG RUT → IWM Conditional Breakout TF: Weeks Conf: High

"The RUT does tend to act as a leader... I think it's going to lead here as well because notice that we are closer than the others."

Entry: if breaks above 1,200, then round number 1,200 and value area around 1,100-1,200

Levels: 1,200, 1,100-1,200

Entry price: $116.86 MFE: +22.32% MAE: -1.50%

5d: +5.43% | 10d: +11.88% | 20d: +13.89% ◀ | 30d: +17.19%

Breakout Win #3: When What Should Happen Doesn't | ShadowTrader Video Weekly 03.29.20

+13.07% (20d)

2020-03-29 LONG NDX → QQQ Conditional Breakout TF: Weeks Conf: High

"If the Nasdaq 100 closes a daily bar above the downtrend line and its 20-period moving average, it may indicate a strong bullish trend."

Entry: if NDX closes a daily bar above the downtrend line and its 20-period moving average

Levels: above downtrend line, above 20-period moving average

Entry price: $180.89 MFE: +24.10% MAE: -3.31%

5d: +5.04% | 10d: +13.26% | 20d: +13.07% ◀ | 30d: +18.74%

Worst long conditional breakout (weeks) for comparison:

Worst Breakout: The Whipsaw Market | ShadowTrader Weekend Edition 03.07.25

-23.46% (20d)

2025-03-07 LONG NIO Conditional Breakout TF: Weeks Conf: Medium

"NIO has been relatively weak and may be due for a reversal if it breaks above its sharpest downtrend line."

Entry: break of sharpest downtrend line

Levels: downtrend line around 20-25, potential break above trend line

Entry price: $4.39 MFE: +25.06% MAE: -31.21%

5d: +14.58% | 10d: -0.23% | 20d: -23.46% ◀ | 30d: -14.58%

Pattern: His best breakout calls share common traits — he identifies a specific technical level (trendline, prior high, moving average), waits for confirmation, and gives the trade 20+ trading days to work. The worst breakout here (NIO) was a speculative small-cap, not his typical setup.

3 Short Day Trades Are a Reliable Fade Signal — 39% WR (Fading = 59% WR)

When Peter calls a short with a day timeframe, the call wins only 39% of the time within 5 trading days. Doing the opposite of his short day trades wins 59% of the time and yields +0.79% avg return. These are his most consistently wrong calls, although the result is only marginally significant (p=0.07).

Failed Short #1 (fading = +15.46%): Market Feels Tired | ShadowTrader Weekend Edition 09.28.24

-15.46% (5d)

2024-09-28 SHORT SMCI Options Trade TF: Day Conf: High

"The stock broke down through the 380 level and failed to follow through, making it a good candidate for a short trade."

Entry: put on a reverse mullet on Thursday

Levels: 380, 375, 360, 355 put strikes

Entry price: $41.35 MFE: +58.28% MAE: -22.40%

5d: -15.46% ◀ | 10d: -14.61% | 20d: -15.41% | 30d: +43.82%

Failed Short #2 (fading = +9.92%): ShadowTrader Video Weekly 01.27.19 - The Most Important Week of the Year

-9.92% (5d)

2019-01-27 SHORT AAPL Directional Call TF: Day Conf: Medium

"We'll see if the earnings report confirms or denies Apple's previous negative comments, which sent the stock lower."

Entry: earnings report confirmation or denial of previous negative comments

Levels: not specified

Entry price: $37.00 MFE: +1.37% MAE: -27.44%

5d: -9.92% ◀ | 10d: -9.22% | 20d: -12.38% | 30d: -16.62%

Failed Short #3 (fading = +9.58%): ShadowTrader Video Weekly 04.28.13

-9.58% (5d)

2013-04-28 SHORT AAPL Active Position TF: Day Conf: High

"For a short-term trade early on Monday, I'm going to be looking for the stock to open either flat or somewhere up somewhere near this area and look to short that 419 to 420 area."

Entry: already in

Levels: 419, 420

Entry price: $12.78 MFE: +2.29% MAE: -10.85%

5d: -9.58% ◀ | 10d: -8.87% | 20d: -5.69% | 30d: -4.77%

Pattern: His short-term bearish calls on individual names tend to be contrarian-at-the-wrong-time. He sees a stock that "should" go down (broken level, bad earnings) and calls a short, but the market absorbs the selling and reverses. Over 100 such calls, this pattern is statistically marginal (p=0.07) but consistent.

4 His Stated Confidence Does Not Predict Outcomes: r=0.04, p=0.20

"High confidence" calls win 53.0% of the time vs. 51.4% for "medium," a negligible difference. The correlation between his confidence level and actual returns is 0.04 (p=0.20), statistically indistinguishable from zero. Here are three of his most emphatic calls that went catastrophically wrong.

High-Confidence Disaster #1: When What Should Happen Doesn't | ShadowTrader Video Weekly 03.29.20

-49.41% (20d)

2020-03-29 LONG CL → USO Directional Call TF: Weeks Conf: High

"Crude oil may bounce towards $30-35 per barrel and potentially retest the prior low of around $26."

Entry: bounce to 30-35 per barrel

Levels: 30-35, potentially back to prior low (26)

Entry price: $33.68 MFE: +42.99% MAE: -49.88%

5d: +30.17% | 10d: +10.69% | 20d: -49.41% ◀ | 30d: -37.74%

High-Confidence Disaster #2: Here We Go Again? | ShadowTrader Video Weekly 11.28.21

-34.80% (20d)

2021-11-28 LONG MRNA Active Position TF: Weeks Conf: High

"Peter is long MRNA with a call butterfly structure, expecting the stock to rally after the COVID-19 news."

Entry: already in

Levels: 375, 400, 425

Entry price: $370.33 MFE: +1.71% MAE: -57.53%

5d: -28.35% | 10d: -26.55% | 20d: -34.80% ◀ | 30d: -40.22%

High-Confidence Disaster #3: So Much Things to Say Right Now | ShadowTrader Video Weekly 09.19.21

-33.05% (20d)

2021-09-19 LONG UVXY Active Position TF: Weeks Conf: High

"Long volatility, expecting a little bit of increase in volatility on the FOMC"

Entry: already in

Levels: 25, 30, 35 (butterfly strikes)

Entry price: $6565.00 MFE: +17.33% MAE: -45.70%

5d: -21.02% | 10d: -7.96% | 20d: -33.05% ◀ | 30d: -38.58%

Takeaway: When Peter says he's "very confident" or "high conviction" on a trade, it carries zero additional signal. The oil call during COVID (-49.41%), the MRNA call before its collapse (-34.80%), and the UVXY call before FOMC (-33.05%) were all high-confidence — and all catastrophic. Do not adjust position sizing based on his stated confidence.

5 Tickers Where He Consistently Destroys Value

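Tesla (33.3% WR, -1.93% avg over 42 calls), Oil/USO (31.2% WR, -2.25% avg over 16 calls), and Nvidia (20.0% WR, -2.84% avg over 10 calls) are the tickers where his calls lose most consistently. None of the three is statistically significant on its own (p=0.34, 0.74, and 0.20), and the USO and Nvidia samples are small, but the Tesla sample is not: 42 calls across multiple years.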

Tesla

Worst Call: How to Make Money While You Sleep | ShadowTrader Video Weekly 12.20.20

-26.83% (20d)

2020-12-20 SHORT TSLA Options Trade TF: Weeks Conf: Medium

"Peter is considering selling 700 calls to finance a butterfly trade on TSLA."

Entry: considering selling 700 calls to finance a butterfly trade

Levels: 700 calls, 605, 65, 20 (butterfly)

Entry price: $222.08 MFE: +7.81% MAE: -35.15%

5d: +0.04% | 10d: -13.47% | 20d: -26.83% ◀ | 30d: -27.58%

Typical Call: Cautiously Bullish - Weekend Edition 01.10.26

-2.34% (10d)

2026-01-10 LONG TSLA Options Trade TF: Swing Conf: High

"Looking for opportunities where the stock pushes on one day, has a very strong day, but is very far away from that resistance."

Entry: already in

Levels: 475-482 calls (30 cents), resistance at 500, consolidation above

Entry price: $441.23 MFE: +2.96% MAE: -12.17%

5d: -4.98% | 10d: -2.34% ◀ | 20d: -3.63%

Oil/Crude

Worst Call: When What Should Happen Doesn't | ShadowTrader Video Weekly 03.29.20

-49.41% (20d)

2020-03-29 LONG CL → USO Directional Call TF: Weeks Conf: High

"Crude oil may bounce towards $30-35 per barrel and potentially retest the prior low of around $26."

Entry: bounce to 30-35 per barrel

Levels: 30-35, potentially back to prior low (26)

Entry price: $33.68 MFE: +42.99% MAE: -49.88%

5d: +30.17% | 10d: +10.69% | 20d: -49.41% ◀ | 30d: -37.74%

Typical Call: ShadowTrader Video Weekly 05.27.18 - Little changed ahead of holiday

-1.25% (10d)

2018-05-27 LONG CL → USO Active Position TF: Swing Conf: High

"I think this is going to open up some opportunities because obviously back to that crude chart this is a trade I would definitely be looking for crude trades very technically and you don't often get good trend line touches four trading opportunities"

Entry: trend line touches

Levels: support at prior high, trend line

Entry price: $108.56 MFE: +12.38% MAE: -4.20%

5d: -2.58% | 10d: -1.25% ◀ | 20d: +5.16% | 30d: +5.45%

Nvidia

Worst Call: Should I Buy NVDA? | Weekend Edition June 1, 2024

-15.29% (10d)

2024-06-01 SHORT NVDA Active Position TF: Swing Conf: High

"Peter is already long NVDA and plans to continue the trade"

Entry: already in

Levels: 1040 puts, 1020 puts, 1000 put, 975 calls, 970 calls

Entry price: $113.56 MFE: +6.44% MAE: -23.90%

5d: -7.19% | 10d: -15.29% ◀ | 20d: -7.97% | 30d: -3.85%

Typical Call: ShadowTrader Video Weekly 12.03.17

-2.69% (5d)

2017-12-03 LONG NVDA Conditional Breakout TF: Day Conf: High

"Nvidia has broken through its trend line and is expected to come back up to the $170 level."

Entry: break above its trend line

Levels: $170, $50 retracement

Entry price: $4.95 MFE: +21.64% MAE: -9.73%

5d: -2.69% ◀ | 10d: -1.07% | 20d: +6.21% | 30d: +12.19%

Pattern: His Tesla calls are overwhelmingly long in a volatile stock that whipsaws him. His oil calls are directional in a commodity driven by geopolitics he can't predict. His Nvidia calls are mostly short during a secular AI bull run. In all three cases, the fundamental thesis conflicts with the price action.

6 Recent Performance (2024–2026): His Worst Era — 47.1% WR, -0.93% Avg

His most recent 238 calls are his worst period on record. Even accounting for timeframe matching, he's been net negative. Below are his best and worst recent calls. Note the pattern of outsized losses on volatility plays (UVXY) and speculative names.

Best Recent Calls

Recent Win #1: Is This a Short Signal? | ShadowTrader Weekend Edition 11.02.24

+69.53% (10d)

2024-11-02 LONG MSTR Active Position TF: Swing Conf: High

"Peter is holding a long position on MSTR, but it's not clear what specific levels or expectations are associated with this trade."

Entry: already in

Entry price: $226.97 MFE: +139.24% MAE: -2.71%

5d: +49.80% | 10d: +69.53% ◀ | 20d: +64.53% | 30d: +70.25%

Recent Win #2: When Will the Market Fall? | ShadowTrader Weekend Edition 10.19.24

+16.63% (20d)

2024-10-19 LONG BTC → BTC-USD Directional Call TF: Weeks Conf: High

"Bitcoin is expected to break out above its all-time high of 69,000."

Entry: breakout above 69,000

Levels: 68,000, 69,000 (target prices)

Entry price: $69002.00 MFE: +50.58% MAE: -5.53%

5d: -2.88% | 10d: +1.76% | 20d: +16.63% ◀ | 30d: +36.72%

Recent Win #3: When Will the Market Fall? | ShadowTrader Weekend Edition 10.19.24

+16.63% (20d)

2024-10-19 LONG BTC → BTC-USD Options Trade TF: Weeks Conf: High

"The Bitcoin Futures and MicroStrategy are expected to move up if Bitcoin breaks out above its all-time high."

Entry: breakout above 69,000

Levels: SL BTC (Bitcoin Futures), MST (MicroStrategy)

Entry price: $69002.00 MFE: +50.58% MAE: -5.53%

5d: -2.88% | 10d: +1.76% | 20d: +16.63% ◀ | 30d: +36.72%

Worst Recent Calls

Recent Loss #1: Is This a Short Signal? | ShadowTrader Weekend Edition 11.02.24

-29.65% (10d)

2024-11-02 LONG UVXY Active Position TF: Swing Conf: High

"Peter is holding a long position on UVXY, expecting volatility between now and December."

Entry: already in

Levels: 3040 spread

Entry price: $148.40 MFE: +1.08% MAE: -38.21%

5d: -29.75% | 10d: -29.65% ◀ | 20d: -36.56% | 30d: -31.57%

Recent Loss #2: How I Lost 1.2 Million | ShadowTrader Weekend Edition 08.10.24

-27.62% (5d)

2024-08-10 LONG UVXY Active Position TF: Day Conf: High

"Peter expects the market to move lower and retest the lows, potentially triggering an avalanche of selling."

Entry: already in

Levels: low end of prior day's bar, 5400

Entry price: $149.90 MFE: +7.34% MAE: -28.35%

5d: -27.62% ◀ | 10d: -23.88% | 20d: -11.04% | 30d: -21.15%

Recent Loss #3: Tariffs Strike the Market Again! | ShadowTrader Weekend Edition 05.23.25

-26.76% (20d)

2025-05-23 LONG NUTX Active Position TF: Weeks Conf: Medium

"Nutex Health has prior highs and trend line support, making it a good candidate to move higher."

Entry: already in (bullish market)

Levels: prior highs, trend line support

Entry price: $164.65 MFE: +5.58% MAE: -38.45%

5d: -10.03% | 10d: -28.84% | 20d: -26.76% ◀ | 30d: -28.57%

Pattern: His recent wins are largely from being long during broad crypto/tech rallies (MSTR, BTC) — not idiosyncratic alpha. His recent losses are concentrated in volatility plays (UVXY: -29.65%, -27.62%) and speculative small-caps (NUTX: -26.76%). The video "How I Lost 1.2 Million" is from this period, and he was still calling long UVXY with high confidence in the same video.

Validation & Red Team Analysis

We ran 7 validation tests to stress-test our findings: naive baselines, gold timing test, out-of-sample split, Bonferroni correction, position sizing proxy, human validation of LLM extractions, and Claude re-extraction comparison. The goal is to answer: which findings are real vs. artifacts of data mining, extraction errors, or market beta?

1. Naive Baselines: Would You Beat Peter By Just Buying SPY?

Strategy	n	Win Rate	Avg Return
Always long SPY (at every video date, 10d)	260	66.5%	+0.73%
Always long GLD (at every video date, 10d)	260	58.5%	+0.64%
Peter — All ideas (timeframe-matched)	951	52.5%	+0.08%
Peter — Longs only	669	57.5%	+0.37%
Peter — Long gold	47	72.3%	+2.20%

Reality check: Just buying SPY every Monday after his video beats his overall record by 14 percentage points of win rate and +0.65% average return. His total call output adds negative value vs. the simplest passive strategy. Only his long gold subset beats the naive baseline on both win rate and average return; the cherry-pick strategy beats it on average return (+1.45% vs. +0.73%) but not on win rate (65.1% vs. 66.5%).

2. Gold Timing Test: Skill or Always-Bull?

Peter's long gold calls average +2.20% — but does he add value beyond just being bullish on gold?

Test	Result	p-value
Peter vs. naive GLD at same dates (paired)	diff = -0.00%	0.3953
Peter's gold dates vs. all video dates	+2.20% vs +0.64%	0.0069

Nuanced finding: Peter's gold returns are identical to buying GLD on the same dates — his specific entries/exits don't add alpha (p=0.40). But his gold timing is real: the dates he chooses to be bullish on gold outperform random dates (p=0.007). He picks the right moments to mention gold, even if he doesn't add alpha beyond "buy GLD now." This could be skill, or it could be that he always says "long gold" and we only counted the ones our LLM captured.

3. Out-of-Sample Test: Train 2007–2020, Test 2021–2026

Do findings from the first 14 years hold in the last 5?

Subset	In-Sample (2007-2020)	Out-of-Sample (2021-2026)	Holds?
All ideas	+0.64% / 55.5%	-0.65% / 48.6%	NO
Longs	+1.00% / 61.9%	-0.43% / 52.0%	NO
Shorts (lose money)	-0.19% / 40.7%	-1.18% / 40.0%	YES
Long gold (GLD)	+1.62% / 70.4%	+2.99% / 75.0%	YES
Long cond. breakouts (weeks)	+2.47% / 71.4%	+1.90% / 70.0%	YES
QQQ all	+1.65% / 66.7%	+1.05% / 65.9%	YES
Long directional calls	+1.41% / 73.6%	-0.98% / 52.5%	NO

Gold, breakouts, QQQ, and shorts-are-bad all hold out-of-sample. His overall longs and directional calls degraded significantly in the 2021-2026 period. His gold edge actually improved out-of-sample (+1.62% → +2.99%), suggesting it's not a fluke.
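The split itself is simple to reproduce. The following sketch assumes each idea is available as an `(entry_date, return_pct)` pair and that the cutoff is 2021-01-01; both the data layout and the function name are illustrative.

```python
from datetime import date


def split_stats(ideas, cutoff=date(2021, 1, 1)):
    """Return {label: (n, avg return %, win rate %)} for each side of the cutoff."""
    buckets = {"in_sample": [], "out_of_sample": []}
    for entry_date, ret in ideas:
        label = "in_sample" if entry_date < cutoff else "out_of_sample"
        buckets[label].append(ret)
    stats = {}
    for label, rets in buckets.items():
        n = len(rets)
        avg = sum(rets) / n if n else float("nan")
        win_rate = 100.0 * sum(r > 0 for r in rets) / n if n else float("nan")
        stats[label] = (n, avg, win_rate)
    return stats


# Toy data, not taken from the backtest.
example = split_stats([
    (date(2019, 5, 6), 2.0),
    (date(2020, 12, 28), -1.0),
    (date(2021, 1, 4), 3.0),
    (date(2024, 6, 3), -4.0),
    (date(2025, 1, 6), -2.0),
])
```

A finding "holds" in the table above when the sign of the average return and the rough level of the win rate agree on both sides of the cutoff.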

4. Bonferroni-Corrected Significance (20 tests, threshold p<0.0025)

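Running 20 statistical tests creates a multiple comparisons problem: at a 0.05 threshold, about one false positive is expected by chance alone. Bonferroni adjustment multiplies each p-value by 20 (equivalently, lowers the threshold from 0.05 to 0.0025) to control for data mining.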

Subset	n	WR	Avg	Raw p	Adj. p	Survives?
Gold — Long	47	72.3%	+2.20%	0.0018	0.037	YES
Gold — Long Weeks	23	82.6%	+3.62%	0.0013	0.025	YES
SPY — Long	97	68.0%	+1.20%	0.0019	0.037	YES
Cherry-pick combined	212	65.1%	+1.45%	0.0001	0.001	YES
Long cond. breakouts (weeks)	72	70.8%	+2.24%	0.0048	0.096	MARGINAL
Gold — All	56	69.6%	+1.82%	0.0045	0.089	MARGINAL
QQQ — All	86	66.3%	+1.36%	0.0405	0.811	NO

Note on SPY Long: This subset survives statistically, but it's partially explained by market beta — being long SPY tends to work because the market tends to go up. The naive baseline shows always-long-SPY at 66.5% WR. Peter's 68.0% is only marginally better. This is likely a reflection of market exposure, not trader skill.
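The adjustment is a one-line calculation. The sketch below applies it to three raw p-values from the table; small differences from the printed adjusted values (0.036 here vs. 0.037 in the table) presumably come from rounding of the raw p-values shown. The function name is illustrative.

```python
def bonferroni(raw_p: dict[str, float], n_tests: int, alpha: float = 0.05):
    """Return {name: (adjusted p, survives)} with adjusted p = min(1, raw * n_tests)."""
    return {
        name: (min(1.0, p * n_tests), p * n_tests < alpha)
        for name, p in raw_p.items()
    }


raw = {
    "Gold Long": 0.0018,
    "Long cond. breakouts (weeks)": 0.0048,
    "QQQ All": 0.0405,
}
adjusted = bonferroni(raw, n_tests=20)
```

Gold Long stays under 0.05 after adjustment, the weeks breakouts land just under 0.10 (the "marginal" band), and QQQ is nowhere near significant once the 20 tests are accounted for.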

5. Position Sizing Proxy (Transcript "Airtime")

If Peter spends more of his video discussing a trade, does weighting by that airtime improve results?

Metric	Value
Equal-weight average return	+0.08%
Airtime-weighted average return	-0.60%
Airtime-return correlation	r=-0.027, p=0.41

He talks most about his worst ideas. Weighting by how much of the transcript discusses each trade makes performance worse, dropping from +0.08% to -0.60%. The correlation itself is weak and not statistically significant (r=-0.027, p=0.41), so this is suggestive rather than conclusive: at minimum, more airtime does not translate into better results.
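An airtime-weighted average is an ordinary weighted mean with airtime as the weight. How airtime was measured is not specified in this report, so the weights in the sketch below (for example, the share of transcript words devoted to each idea) are a hypothetical stand-in, as is the function name.

```python
def weighted_average(returns_pct: list[float], weights: list[float]) -> float:
    """Weighted mean of returns; weights need not sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(r * w for r, w in zip(returns_pct, weights)) / total


# Toy data: the idea that gets the most airtime is the loser.
returns = [2.0, -1.0, 0.5]
airtime = [1.0, 3.0, 1.0]

equal_weight = weighted_average(returns, [1.0] * len(returns))
airtime_weight = weighted_average(returns, airtime)
```

In the toy data the equal-weight average is positive while the airtime-weighted average turns negative, which is the same direction of effect reported above.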

6. LLM Extraction Quality Audit

We validated the Llama 3.1 8B extractions two ways: manual review of 10 video transcripts (100 ideas) and parallel re-extraction by Claude Opus (5 videos, 74 Claude ideas vs 66 Llama ideas).

Metric	Manual Audit (10 videos)	Claude Comparison (5 videos)
Ticker accuracy	95.0%	—
Direction accuracy	86.0%	89.8% agreement
Idea type accuracy	78.0%	65.3% agreement
Timeframe accuracy	~85.0%	69.4% agreement
Hallucination rate	7.0%	—
Precision (Llama)	—	74.2%
Recall (Llama)	—	66.2%
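The precision and recall figures follow from three counts, assuming Claude's 74 ideas are treated as the reference set. The number of matched ideas is not stated in this report; 49 is an inference that happens to reproduce both percentages, and should be read as an assumption.

```python
def precision_recall(matched: int, extracted: int, reference: int) -> tuple[float, float]:
    """Precision = matched / extracted; recall = matched / reference."""
    return matched / extracted, matched / reference


# 66 Llama ideas, 74 Claude ideas, 49 assumed to appear in both.
precision, recall = precision_recall(matched=49, extracted=66, reference=74)
```

With those counts, precision is 49/66 (about 74.2%) and recall is 49/74 (about 66.2%), matching the table.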

Critical failure modes identified:

Direction inversions (~14%): The LLM flips long↔short in roughly one of every seven ideas, e.g., interpreting a losing long trade as an active short position. This directly corrupts backtest accuracy.
Over-extraction: Earnings reporters and sector commentary get inflated into individual trade ideas. One 20-idea video had 7 phantom earnings "ideas" with no substance.
Fabricated levels: ES at 1500 in May 2024 (should be ~5100). Copy-paste bugs assigned identical levels to 5 different ETFs.
Low recall (~66%): Llama misses about 1/3 of actual ideas, especially sector observations, market profile trades, and passing mentions with trade implications.

Impact on conclusions: The 14% direction error rate and 66% recall mean our backtest is working with noisy, incomplete data. However: (1) ticker identification at 95% means our per-ticker breakdowns are reliable for frequency and which assets he discusses; (2) the gold finding is robust because gold direction is almost always "long" and rarely gets flipped; (3) findings with large sample sizes (200+) are likely robust to ~14% noise; (4) small-sample findings (n<30) should be treated as suggestive, not definitive.
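
The robustness argument in point (3) can be made concrete with a simple noise model. The sketch below assumes that a fixed fraction of entries have their direction flipped independently of the outcome, that a flipped entry records a win exactly when the real call lost (no flat trades), and that hallucinated and missed ideas can be ignored. None of this is the study's method; it only illustrates that random flips pull an observed win rate toward 50% rather than away from it.

```python
def observed_win_rate(true_wr, flip_rate):
    """Win rate the backtest would record if a share of directions were flipped at random."""
    return (1 - flip_rate) * true_wr + flip_rate * (1 - true_wr)


def implied_true_win_rate(observed_wr, flip_rate):
    """Invert observed_win_rate; undefined when half of all entries are flipped."""
    if flip_rate == 0.5:
        raise ValueError("flip_rate of 0.5 destroys all directional information")
    return (observed_wr - flip_rate) / (1 - 2 * flip_rate)


flip_rate = 0.14         # direction error rate from the manual audit
shorts_observed = 0.404  # reported win rate on short ideas

shorts_implied = implied_true_win_rate(shorts_observed, flip_rate)
print(f"observed {shorts_observed:.1%} -> implied true {shorts_implied:.1%}")
```

Under this model an observed 40.4% short win rate corresponds to a true rate of about 36.7%, so random direction errors would understate the finding rather than create it.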

Top 10 Individual Trades

Date	Ticker	Direction	Type	Timeframe	Return
2024-11-02	MSTR	Long	Active Position	Swing	+69.53%
2020-03-01	USO	Short	Directional Call	Weeks	+56.26%
2020-03-29	USO	Short	Directional Call	Weeks	+49.41%
2020-05-31	FANG STOCKS	Long	Active Position	Day	+41.42%
2020-07-26	TSLA	Long	Active Position	Weeks	+40.36%
2013-01-27	NIK	Short	Active Position	Weeks	+40.00%
2013-02-10	SB	Long	Active Position	Months	+34.78%
2008-08-24	RBN	Short	Active Position	Swing	+30.64%
2020-12-24	TSLA	Long	Active Position	Weeks	+28.12%
2022-02-13	TSLA	Long	Active Position	Ongoing	+27.62%

Bottom 10 Individual Trades

Date	Ticker	Direction	Type	Timeframe	Return
2020-03-29	USO	Long	Directional Call	Weeks	-49.41%
2021-11-28	MRNA	Long	Active Position	Weeks	-34.80%
2021-09-19	UVXY	Long	Active Position	Weeks	-33.05%
2020-03-15	USO	Long	Active Position	Swing	-30.54%
2020-03-15	USO	Long	Conditional Breakout	Swing	-30.54%
2024-11-02	UVXY	Long	Active Position	Swing	-29.65%
2024-08-10	UVXY	Long	Active Position	Day	-27.62%
2020-12-20	TSLA	Short	Options Trade	Weeks	-26.83%
2025-05-23	NUTX	Long	Active Position	Weeks	-26.76%
2024-07-20	MSTR	Long	Active Position	Swing	-26.63%

Conclusions

What This Study Shows (Post-Validation)

Finding	Evidence	Validation	Confidence
Gold timing is real, but it's just gold	72.3% WR, p_adj=0.037, holds OOS (+2.99%)	Survives Bonferroni, OOS, but doesn't beat naive GLD at same dates (p=0.40)	High
Long conditional breakouts (weeks) work	70.8% WR, p_adj=0.096 (marginal)	Marginal Bonferroni, but holds OOS (71%→70%)	Moderate
QQQ calls have marginal edge	66.3% WR, p_adj=0.811	Fails Bonferroni, but holds OOS (67%→66%)	Low
His shorts lose money	40.4% WR, -0.61% avg, n=282	Holds OOS (41%→40%), large sample robust to noise	High
His confidence doesn't predict outcomes	r=0.02, p=0.58	Not tested OOS but sample is massive (951)	High
He's getting worse over time	55.5% WR in 2007-20 → 48.6% in 2021-26	OOS split directly demonstrates this	Moderate
Overall record is random	52.5% WR, p=0.77	Always-long-SPY beats him at 66.5% WR	High
LLM extraction has ~14% direction errors	86% direction accuracy, 7% hallucination rate	Validated on 15 transcripts (100+ ideas)	Moderate impact

Actionable Takeaways (Validated)

What survived all validation tests:

His gold timing is real, but you don't need him for it. Simply buying GLD on the dates he mentions gold performs statistically indistinguishably from his gold calls (paired t-test p=0.40). His value-add is when to pay attention to gold, not how to trade it.
Long conditional breakouts (weeks) are his best skill-based edge — 70.8% WR that holds out-of-sample, though only marginal after Bonferroni (p_adj=0.096).
Don't follow his short calls: the 40% WR holds both in-sample and out-of-sample. Taken as a contrarian indicator, this is his most reliable signal.
Ignore his confidence levels — r=0.02 with returns. Not predictive.
Give his ideas time to work — his 20-day results beat his 5-day results consistently.
His overall output is worse than just buying SPY — always-long-SPY at 66.5% WR crushes his 52.5%.

Fair caveats on data quality:

LLM extraction errors: 14% direction inversions, 7% hallucination rate, 66% recall. Our backtest is working with noisy, incomplete data.
Futures-to-ETF proxies introduce tracking error (especially for leveraged/inverse products like UVXY).
"Already in" positions are measured from Monday open, not actual entry — this penalizes active positions.
Options structures are evaluated directionally, which understates performance for well-structured spreads.
Selection bias in the gold thesis: Peter may say "long gold" in every video, but our LLM captures it only some of the time. His gold "edge" may partly reflect extraction bias toward his most emphatic mentions.

What this analysis can and cannot do: This is a sniff test, not a definitive audit. With 74% extraction precision and 66% recall, individual trade entries should not be trusted without manual verification. The large-sample patterns (overall randomness, shorts losing money, confidence not predicting outcomes) are the most robust to these error rates because they rest on roughly 280 to 950 observations, where random direction errors dilute an effect rather than create one. The gold finding rests on a much smaller sample (n=47 to 56); its robustness comes from gold calls being almost always long and rarely flipped, not from sample size. The small-sample findings (individual tickers with n<30) are suggestive, not definitive.