(* The transcript below is from earlier today, when we were fielding a lot of questions and decided it was best to go ahead and give an explainer on the way bootrom dumping on the Switch took place. It's not quite a writeup or an in-depth technical document, but hopefully it can serve as a basic explanation of how we did what we did and why. )

RobocopBOT - Today at 6:27 PM
:lock: Channel locked down. Only staff members may speak. Do not bring the topic to other channels or risk disciplinary actions.

hedgeberg - Today at 6:28 PM
Ok so, lock is temporary just for the purposes of this explainer, let's get started.
First off, let's explain what exactly bootrom /is/ and why it can't be obtained by other means

ktemkin - Today at 6:28 PM
Suggestion: keep this in a nice condensed message as much as possible so we can pin it when you're done?

hedgeberg - Today at 6:29 PM
Oh god no this is not going to fit in one message >.<
Let's pastebin the whole deal afterwards and post the link.

ktemkin - Today at 6:29 PM
okay, sure-- that works too
d'you want me to proofread you and refine what you're saying, or do you just want to go?

hedgeberg - Today at 6:29 PM
Let me know if I make any glaring mistakes or if I missed something, absolutely

So, let's talk Tegra: when it comes to the Tegra, all execution starts with the BPMP, the Boot and Power Management Processor (an ARM7TDMI). For those of you who have read the writeup on jamais vu, that should sound familiar: it's the same thing @SciresM pwned for TZ exec on 1.0. Basically, the BPMP is responsible for setting up everything on the TX1 so that things are safe and ready to execute when it comes time for the OS to start running on the AP (Application Processor). The AP is a 4-core aarch64 processor where almost all of the OS runs, and for the rest of this we won't really be talking about it, because everything of interest that we're about to discuss takes place before that is even active.
So, when we talk about bootrom, we're talking about the very first code to execute on the BPMP. It's a set of instructions burnt into the silicon on the physical layer using what's called a "maskrom", basically a set of metal wires that are used to encode 1's or 0's. In addition, there's a set of "eFuses", i.e. fuses that are used to permanently and irreversibly store information at the factory before the device is shipped. The eFuses contain patches for the bootrom so that NVIDIA can fix bootrom bugs before shipping the SoC without fabricating a whole new set of masks and redoing the whole production of the chip. In addition, the Secure Boot Key (aka SBK) and other important console information are stored in eFuses.

All that being said, I think we're good to dive in:

The bootrom is usually a big mess of spaghetti code and it is responsible for a /lot/ of important stuff. For example, the bootrom reads the BCT (boot config table) from the NAND flash and uses that to figure out if what's in the NAND is safe to boot. In the Switch, only BCT entries that are signed by Nintendo can be loaded, because the TX1 verifies the BCT entry using RSA crypto. This is high-responsibility and complex code, it's usually where a lot of neat vulnerabilities end up, and manufacturers know this, so they lock out access to it as part of the bootrom before they pivot to execute what we'd call the "stage 1 bootloader", or Package1 on the Switch. The idea is that if hackers don't have access to the code, they can't find vulnerabilities.

That leaves us 2 options:
1, somehow find an exploit in the bootrom that allows us to dump the bootrom before lockout, and do this without the actual code (basically, fuzzing all possible inputs from any possible source), which is basically shooting blind, or
2, somehow prevent the bootrom lockout from actually taking place and/or "trick" the processor into taking a path or a set of instructions it never would have if operated correctly.
(there is also a 3rd option, decapsulation or "decapping" and imaging, but that is an absurdly expensive process that is far beyond the reach of a team like ours)

Option 1 is, well, possible, but outright terrible. The TX1 is a really, really complex SoC with thousands of different potential fuzzing routes. As such, option 2 is the ideal. But the question then becomes, how do we actually cause one of the two situations listed under option 2 to happen?

Ideally, we'd reach a point where, when our code begins to execute, the bootrom is just not locked out, and we can slurp it right out. That means we need 2 things: the actual lockout skip, and code executing that can read things after the lockout skip.

So, for the second part, our life is made a lot easier by not using the Switch, since for code to execute on the Switch, it needs to be signed by Nintendo. We instead use the Jetson TX1, which is a dev board for the same SoC as used in the Switch, and (conveniently!) uses the same bootrom and patch set, too.
(*edit: When writing this, I had forgotten that there are a couple more patches on the Switch than the Jetson, meaning that the bootrom is the same but the patch set is slightly different. Sorry, guys!)

That takes care of code execution on the BPMP, we can run whatever we want there after the bootrom, but as for the bootrom lockout, we're still not well-off.

Actually rq, to make sure I'm not forgetting anything, @SciresM I'm not wrong in thinking bootrom lockout is done by the bootrom and not package1 on the Switch, right?

(2 minutes go by)

Ok, I'm going to assume I'm remembering correctly and that it hasn't been too long, and that the bootrom lockout is in fact done in the bootrom itself on the Jetson.

The basic idea is that we need to "skip" an instruction, or corrupt its operation so completely that it doesn't produce the desired result. In this case, that would be the store operation in the BPMP that's used to write to the bootrom lockout register.
If we skip that instruction, that register is never written to, and when we get execution we can read from the space the bootrom is in just like we'd read from any other region of memory.

TuxSH - Today at 6:56 PM
@hedgeberg I'm not SciresM but I can confirm that the bootrom in fact disables itself by writing 0x1C (0b11100) to SB_CSR_0 just before jumping to the entrypoint (bit 3-2: reserved, bit4: "PIROM_DISABLE: Protected iROM Disable", see page 619 of the TRM)
...ah, hm, I may be wrong about the value written, it's actually 0x10. So yeah, SB_CSR_0 = PIROM_DISABLE

hedgeberg - Today at 6:58 PM
thanks @TuxSH
wanted to make sure I wasn't being a dumbass

SciresM - Today at 6:58 PM
Correct
bootrom locks itself out.

hedgeberg - Today at 6:59 PM
Ok, so, the idea is that if we can somehow hop over that instruction while letting the bootrom continue to execute, when we land in the BPMP firmware that we control on the Jetson, we'll be able to read the bootrom right out of memory.

hedgeberg - Today at 7:00 PM
This is where "glitching" comes in. Glitching is basically the process of injecting some sort of controlled noise (yes, that's a misnomer) on the power supply to corrupt the operation of the processor. Processors are designed to work within a certain range of operations, with a certain set of environmental constants. So, for example, they usually expect to work in a range of -25C to 100C or some such range, and outside of that range their behavior can become unpredictable and erratic.

...which would be nice for our purposes if it was unpredictable in the way we wanted, but that kind of temperature-based operating corruption isn't very controllable or predictable, and it isn't precise. However, with an FPGA (field programmable gate array) and a large high-speed MOSFET, we can very quickly and precisely change the power supply voltage that's made available to the BPMP.
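To make the goal of the glitch concrete, here is a minimal Python sketch of the lockout as described above. Everything in it is a toy model: the `ToyBpmp` class and its methods are hypothetical, and returning `None` for a locked-out read is an assumption made for illustration, not a statement about how the hardware responds. Only the bit position (bit 4, value 0x10, PIROM_DISABLE in SB_CSR_0) comes from TuxSH's note.

```python
from typing import Optional

# Bit 4 of SB_CSR_0, per TuxSH's note: writing 0x10 disables the protected iROM.
PIROM_DISABLE = 1 << 4


class ToyBpmp:
    """Toy model of a boot processor whose bootrom hides itself before handoff."""

    def __init__(self, irom: bytes) -> None:
        self._irom = irom
        self.sb_csr_0 = 0  # lockout register starts cleared in this model

    def run_bootrom(self, skip_lockout_store: bool = False) -> None:
        # (Real bootrom work such as BCT loading would happen before this point.)
        if not skip_lockout_store:
            # This is the single store the glitch tries to skip.
            self.sb_csr_0 |= PIROM_DISABLE

    def read_irom(self, offset: int, length: int) -> Optional[bytes]:
        # Assumption: a locked-out read simply yields nothing in this model.
        if self.sb_csr_0 & PIROM_DISABLE:
            return None
        return self._irom[offset:offset + length]


if __name__ == "__main__":
    normal = ToyBpmp(b"\xde\xad\xbe\xef")
    normal.run_bootrom()
    print("normal boot:", normal.read_irom(0, 4))

    glitched = ToyBpmp(b"\xde\xad\xbe\xef")
    glitched.run_bootrom(skip_lockout_store=True)
    print("glitched boot:", glitched.read_irom(0, 4))
```

In the model, the only difference between the two boots is whether one store happened, which is why a single skipped instruction is enough for the payload to read the bootrom afterwards.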
This kind of glitching is called a "brownout" or "crowbar" glitch, the idea being that you're forcing the rail to the value you want for a short period of time. If the BPMP doesn't have a high enough voltage that it can draw the power needed to operate correctly, data inside of it will corrupt. Pull the voltage down for too long, and you'll shut down the processor like how an engine would stall if you cut off the gas supply. Pull the voltage down for too little time, and the internal capacitances that exist on the gates of the transistors inside the SoC won't have time to actually change. But, if you hit a sweet spot in terms of timing, you can create a situation where just a few bits might change in different places.

Let's say, for example, you corrupted the Program Counter register just as it was incrementing, at just the right time. You could cause it to over-increment, skipping over the instruction. That event isn't likely to actually happen, compared to other things that could, but it's an easy example to conceptualize. The thing about modern processors is that they're extremely complex. Executing one instruction takes long pipelines of pre-processing, and different stages in those pipelines can have different effects if browned-out. However, since we don't have the bootrom, we don't know where exactly the instruction we'd want to skip even is timewise.

So, for that specific glitch, once you have modified the board to be ready for the attack, and you have your payload ready to run on the BPMP, it basically comes down to fuzzing. You need to find the right width of the glitch and the right position in time to actually cause the desired corruption, so you want something to automate that process. For our attack on the Jetson, we used the ChipWhisperer, which is a board by a hardware hacker named Colin O'Flynn designed to make the automation process easier.
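The "fuzz the width and position" loop can be sketched in a few lines of Python. This is only a simulation under stated assumptions: `simulated_target` is a hypothetical stand-in for "reset the board, fire one glitch, and see what the payload reports", and the nanosecond ranges and the location of the sweet spot are invented for illustration. It does not use the ChipWhisperer API, and none of the numbers are the values used in the real attack.

```python
import random
from typing import Optional, Tuple


def simulated_target(offset_ns: int, width_ns: int) -> str:
    """Hypothetical stand-in for one reset-and-glitch attempt on real hardware."""
    if width_ns > 120:
        return "crash"    # held low too long: the processor stalls
    if width_ns < 40:
        return "normal"   # too short: internal capacitances never change
    if 5000 <= offset_ns < 5050:
        return "success"  # made-up sweet spot: the lockout store is skipped
    return "normal"       # glitch landed somewhere harmless


def search(attempts: int, rng: random.Random,
           max_offset_ns: int = 10_000,
           max_width_ns: int = 200) -> Optional[Tuple[int, int, int]]:
    """Try random (offset, width) pairs until the payload reports success."""
    for attempt in range(1, attempts + 1):
        offset = rng.randrange(max_offset_ns)
        width = rng.randrange(1, max_width_ns)
        if simulated_target(offset, width) == "success":
            return attempt, offset, width
    return None


if __name__ == "__main__":
    result = search(200_000, random.Random())
    if result is None:
        print("no working glitch found")
    else:
        attempt, offset, width = result
        print(f"hit on attempt {attempt}: offset={offset} ns, width={width} ns")
```

On real hardware each attempt costs a full reboot, so the number of attempts matters far more than it does in this simulation.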
From there, it was just a question of poking around at different times randomly until, on one reboot, the payload reported that it could in fact read from the bootrom space. After that it was as simple as dumping it out over UART.

This isn't the only thing we ended up glitching. We also worked on glitching to get code execution on the Switch BPMP to get things like keyblobs, as did Derrek, but that's a much more complex process. Instead of targeting a register write, we targeted a branch that checked whether or not the RSA key stored in a BCT entry was correct, tricking the processor into accepting a BCT that we signed instead of Nintendo.

Ok, I'm assuming there are questions, especially about some parts I glossed over, so post 'em in #hack-n-all and I'll try to respond to em
@ktemkin while I do that, did I miss anything?

ktemkin - Today at 7:16 PM
just a sec, parsing

( * user @MrSlick asks: "when you get around to questions i have one. how does what you glitched out differ from the nvtboot_recovery dump you have been reversing on stream")

hedgeberg - Today at 7:17 PM
kk, so the first question I saw was from @MrSlick, asking about nvtboot_recovery

ktemkin - Today at 7:17 PM
Okay, worth noting: the most electron movement happens in places where transistors are switching-- so areas with transistors switching are significantly more likely to be affected by a voltage glitch. The more you have switching, the more likely an area is to be affected.

hedgeberg - Today at 7:18 PM
nvtboot_recovery is more like package1 on the Switch. It's not the bootrom, it's more like a firmware binary

ktemkin - Today at 7:18 PM
This means that voltage glitches generally are best for corrupting the output of more significant computations.
hedgeberg - Today at 7:18 PM
we're analyzing that on the stream because it's similar, but not the same
yes, thx @ktemkin, that is a good point, voltage glitching is better for attacking things like hashes and crypto operations etc, things that use a lot of power

ktemkin - Today at 7:19 PM
Or for affecting things like next-address computations

hedgeberg - Today at 7:19 PM
Yeah, I just didn't want to get too much into the physics so much as give an overview.

( * user @Mak asks: "@hedgeberg could you explain more about the 3rd option? and why it's expensive?" (this was much earlier in the post, referring to the 3 options for getting bootrom))

Ok, next question is from @Mak who asked about the imaging/decap solution.

So, the thing about modern SoCs is that even the largest objects on any layer are /tiny/. The MOSFETs, for example, the building blocks of logic gates, are 22 nm thick in the Tegra, iirc. 22 nm is smaller than your average bacterium, and 1 nm is about the width of a carbon atom (unless I'm drastically misremembering). To look at that, we need really heavy-duty equipment, not even Scanning Electron Microscopes are enough. We generally need something like a Transmission Electron Microscope (TEM) or Atomic Force Microscope (AFM) to do that kind of imaging, and those tools are not cheap, even to rent time on.

In addition, there are usually around 10 layers of metal wiring above the transistors and substrate, and to analyze the chip we need to analyze each layer, meaning we need extremely precise systems for scraping off each layer. Plus, the chip is bonded to a PCB using epoxy that can only be dissolved using dangerous acids, like very-high-concentration (aka "Fuming") nitric acid, so it's a safety hazard, a time commitment, and a financial investment, all rolled into one. People do do that kind of reverse engineering, but those systems are expensive so you need a big budget or friends in high places.
( * user @FineTralfazz asks: "when you say a "large" mosfet, what do you mean?")

Ok @FineTralfazz asks what I meant by a "large" MOSFET. For glitching, we usually use a MOSFET with really low Rds, meaning the effective impedance between the drain and the source. We want to be able to pull power away from the node as quickly as possible, so smaller Rds is better. In addition, it needs to be fast and able to handle a lot of power, so RF MOSFETs intended for use in stuff like amplifiers work well.

( * user @roblabla asks: "how expensive would it be to replicate the setup - Assuming all we have is a Jetson - in terms of material ?")

@roblabla so, to set up the system, you need:
-Jetson: $600
-ChipWhisperer: ~$250
-FPGA: ~$100
-Transistors/gate drivers/etc: ~$50
-(recommend adding a logic analyzer and an oscilloscope as this would not be possible without them imo, but those are really expensive)
So, total, including the target board, is about $1k. That doesn't mean there aren't cheaper ways to do it, but it's definitely cost-prohibitive.

( * user @merry asks: "Essentially, summary, fuzz glitch parameters with success condition being your BCT being successfully executed? I had thought it would be more complex.")

@merry you asked about BCT, and the bootrom glitch doesn't need BCT, it's not terribly complex
the BCT glitch is a lot more complex because the timing window is way larger. For that we synchronized using our handy-dandy FPGA to process eMMC commands. When we saw the end of the BCT read via the eMMC i/o, we could use that as a trigger for the ChipWhisperer. The most efficient way to do the attack though, in reality, would be to fill up the BCT with entries containing our bogus key and trigger on each attempt (the TX1 reads 64 entries before just giving up). Pinning down the glitch in that is really tricky, you basically have to analyze the time between the end of one read and the start of another to try and figure out what codepaths are being executed.
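The "time between the end of one read and the start of another" analysis can be sketched like this in Python. It is a rough illustration under assumptions: the `(start, end)` timestamp pairs are made-up numbers standing in for whatever an FPGA or logic analyzer capture of the eMMC bus would give you, and `inter_read_gaps` and `bucket_gaps` are hypothetical helper names. The idea is just that reads followed by similar gaps probably went through the same codepath, and an unusual gap points at a different one.

```python
from collections import Counter
from typing import List, Tuple


def inter_read_gaps(reads: List[Tuple[int, int]]) -> List[int]:
    """Gap between the end of each read and the start of the next one.

    `reads` is an ordered list of (start, end) timestamps, one per BCT read.
    """
    return [next_start - end
            for (_, end), (next_start, _) in zip(reads, reads[1:])]


def bucket_gaps(gaps: List[int], resolution: int) -> Counter:
    """Group gaps into bins of `resolution` time units to spot distinct paths."""
    return Counter((gap // resolution) * resolution for gap in gaps)


if __name__ == "__main__":
    # Invented capture, in arbitrary time units: three quick retries, then
    # one much longer pause before the next read.
    capture = [(0, 100), (150, 250), (300, 400), (900, 1000)]
    gaps = inter_read_gaps(capture)
    print("gaps:", gaps)
    print("buckets:", dict(bucket_gaps(gaps, 100)))
```

With a real capture you would look at how the gap after a rejected entry compares with the others, then aim the glitch trigger at the window where the check appears to run.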
It sounds easy from a high level because overall the concept is pretty simple, admittedly, but getting the system to a point where it was even arguably "working" took me about 200 hrs at least. There are a lot of moving parts that you run into, which is why I tell people it's not simple. A background in CPU architecture, PCB design, and analog electronics helps /a lot/.