My Research
Research & Projects
I. Efficient Transformer Acceleration via Reconfiguration for Encoder and Decoder Models and Sparsity-Aware Algorithm Mapping
This project proposes a reconfigurable design that efficiently supports both encoder and decoder stages, maximizing PE utilization and parallelism
Leading the front-end and back-end work (RTL design, synthesis, and PnR)
Under review at DAC 2024
II. Hyperdimensional Computing with ReRAM ASIC: A Reconfigurable and Energy-Efficient Approach
(Sponsored by PRISM Center, SRC)
Mixed-Signal Chip Fabricated
Front-end and back-end design of the analog controller (RTL design, synthesis, PnR, and GDSII)
Prototype fabricated in TSMC 40nm CMOS technology
To be updated
III. HDnn: A Hybrid Hyperdimensional Computing Accelerator for On-Chip Inference and Few-Shot Learning
(Sponsored by PRISM Center, SRC)
Digital Chip Fabricated & Measured
Front-end and back-end work (RTL design, synthesis, PnR, GDSII, and measurement)
Prototype fabricated in TSMC 40nm CMOS technology with an 11.3mm^2 die area
To be published soon
IV. XGBoost Decision Tree Using Modular Unit Tree ASIC Design
(Sponsored by KETI, Seongnam-si, Korea)
Digital Chip Fabricated & Measured
Led the front-end and back-end work (algorithm simulation, RTL design, synthesis, PnR, GDSII, and measurement)
Prototype fabricated in TSMC 65nm CMOS technology with a 2.8mm^2 die area, achieving a throughput of 1.0 G tree operations/s, 1.16 TOPS, 19 fJ/node, and 52.5 TOPS/W
Accepted at CICC 2024
V. Deep Convolutional Neural Network for Predicting the Prognosis of Root Canal Treatment (Class Project)
(ECE 268, UCSD)
Contact for more info
VI. PAHV: A 65nm 1.68TOPS/W NLP SIMD Processor with Pipelined MAC Array, Asynchronous Dual Core, Half Memory Buffer Techniques (Class Project)
(ECE 260B, UCSD)
Contact for more info
VII. Privacy-aware Deep Learning-based Recommendation System (Class Project)
(ECE 268, UCSD)
Contact for more info
VIII. CNNs on Multi-Core 2-D Systolic Array with Versatile Pruning (Class Project)
(ECE 284, UCSD)
Contact for more info
Undergrad Projects
I. High Unity-Gain Bandwidth Two-Stage Amplifier Design
(Bio-Application System Integrated Circuit LAB)
Designed and configured a high-bandwidth two-stage amplifier using the Virtuoso tool:
Target specs: 1.8V supply voltage, 0.9V input common-mode, gain > 80dB, unity-gain bandwidth (UBW) > 1MHz, phase margin (PM) > 60°, power consumption < 5uW, CMRR & PSRR > 100dB, input-referred noise < 1mVrms
Achieved: gain = 99dB, UBW = 1.127MHz, PM = 70.5°, power consumption = 4.65uW, CMRR & PSRR > 100dB, input-referred noise = 26uVrms, using Miller capacitor and zero-cancelling topologies
[Independent Pilot Study] Designed a two-stage amplifier with 1.13MHz unity-gain bandwidth and 3.02uW power consumption operating in the sub-threshold region
Achieved: gain = 98dB, UBW = 1.13MHz, PM = 60.6°, power consumption = 3.02uW, CMRR & PSRR > 100dB, input-referred noise = 25uVrms; power is lower because MOSFETs operating in the sub-threshold region consume less power
II. Advanced LDO Regulator using Self Cascode Error Amplifier
(Bio-Application System Integrated Circuit LAB)
Academic Paper (Graduation Thesis)
Designed conventional LDO linear regulator ICs using the Virtuoso tool:
Stabilizing the regulator was difficult because the load capacitance was relatively small; used a super source follower and feed-forward technique to secure a stable phase margin
Designed advanced LDO linear regulator ICs with a self-cascode error amplifier using the Virtuoso tool:
The self-cascode error amplifier increased the gain and unity-gain bandwidth, improving the transient response
III. Speaker driver circuit including Low Pass Filter
Amplified the input sound and filtered out noise above 10 kHz; when tested on the soldered board, the noise was filtered out well
Designed the LPF circuit using PSPICE; manufactured the PCB using the PADS tool; arranged devices using PADS Layout; checked clearance and connectivity errors using Verify Design
IV. 2D-DCT in JPEG Image Compression Hardware
(VLSI Signal Processing LAB)
Designed an area & power-efficient 2D-DCT in JPEG Image Compression Hardware in Verilog
Used a ping-pong structure and optimized the transpose memory and 2D-DCT by removing high-frequency components of the images; selected as the best work for achieving the most area-efficient module
Exploited the symmetry of the coefficient matrix and computed the optimal bit length for bit truncation to minimize the number of modules
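As a rough software illustration of the row-column approach behind a 2D-DCT with a transpose stage, the sketch below applies a 1D DCT to the rows, transposes, and applies it again. It is a minimal floating-point model under stated assumptions, not the Verilog design itself: the function names (`dct_matrix`, `dct_1d_rows`, `dct_2d`) are hypothetical, and it does not model the ping-pong buffering, high-frequency pruning, or bit truncation.

```python
import math

N = 8  # JPEG operates on 8x8 blocks


def dct_matrix(n=N):
    """Orthonormal DCT-II coefficient matrix C (row k, column i)."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                     for i in range(n)])
    return rows


def dct_1d_rows(block, c):
    """Apply the 1D DCT to every row of the block."""
    return [[sum(c[k][i] * row[i] for i in range(len(row)))
             for k in range(len(c))] for row in block]


def transpose(m):
    """Software stand-in for the transpose memory between the two passes."""
    return [list(col) for col in zip(*m)]


def dct_2d(block):
    """2D DCT as two 1D passes: rows, transpose, rows again, transpose back."""
    c = dct_matrix(len(block))
    row_pass = dct_1d_rows(block, c)
    col_pass = dct_1d_rows(transpose(row_pass), c)
    return transpose(col_pass)


# Symmetry that lets hardware share multipliers:
# C[k][N-1-i] == (-1)**k * C[k][i] for every row k.
coeffs = dct_matrix()
symmetric = all(
    abs(coeffs[k][N - 1 - i] - (-1) ** k * coeffs[k][i]) < 1e-12
    for k in range(N) for i in range(N)
)
```

Because each coefficient row is either even- or odd-symmetric, only half of the coefficients in a row are distinct in magnitude, which is the property a hardware design can exploit to reduce the number of multiplier modules.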
V. Area-efficient De-Interleaver Hardware Architecture
(VLSI Signal Processing LAB)
De-interleaved the interleaved input data and optimized the architecture by finding patterns in it; achieved twice the area and power efficiency of the class average
Designed the de-interleaver in Verilog, simulated with ModelSim and synthesized with the Design Vision tool
VI. MD5 Hash Function Design
(VLSI Signal Processing LAB)
Arbitrary-length input message -> fixed-length output digest (one-way hash function)
Designed an MD5 architecture in Verilog that computes the hash
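The arbitrary-length-in, fixed-length-out behavior can be sketched in software with Python's standard `hashlib`. The helper `md5_pad` below is a hypothetical name introduced only for illustration; it shows the message padding defined for MD5 in RFC 1321, which is what lets a datapath process the message in 512-bit blocks. This is a reference model under those assumptions, not the Verilog architecture.

```python
import hashlib
import struct


def md5_pad(message: bytes) -> bytes:
    """Pad a message the way MD5 does: 0x80, zeros, then the 64-bit length."""
    bit_len = (8 * len(message)) & 0xFFFFFFFFFFFFFFFF
    padded = message + b"\x80"
    # Zero-fill until the length is 56 bytes modulo 64 (448 bits modulo 512).
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)
    # Append the original length in bits as a little-endian 64-bit value.
    return padded + struct.pack("<Q", bit_len)


# Inputs of different lengths always yield a 16-byte (128-bit) digest.
messages = [b"", b"abc", b"x" * 1000]
digest_sizes = [len(hashlib.md5(m).digest()) for m in messages]

# The padded message is always a whole number of 64-byte (512-bit) blocks.
block_counts = [len(md5_pad(m)) // 64 for m in messages]
```

A software model like this is also a convenient source of golden digests when checking an RTL implementation in simulation.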
VII. Multiple Digital Integrated Circuit Design Projects from Schematic to Layout
(VLSI Signal Processing LAB)
Used the Cadence Virtuoso CAD tool for design and layout
More details may be uploaded soon
VIII. Enhanced Stability of 8T and 10T SRAM over 6T SRAM
(VLSI Signal Processing LAB)
Designed 8T and 10T SRAM cells to alleviate the data conflict caused by the path shared between read and write operations
The additional transistors separate the SRAM's read and write operations, reducing data conflict
Verified the enhanced stability relative to 6T SRAM with Monte Carlo simulation
8T and 10T SRAM showed a much lower failure rate than 6T SRAM
Failure Rate of 6T SRAM: 100%
Failure Rate of 8T SRAM: 49.9%
Failure Rate of 10T SRAM: 61.5% (Read fail condition: when Vdd = 0.6V, RSNM < 0.23685V)
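The way a Monte Carlo failure rate is counted against the read-fail criterion can be sketched as below. Only the 0.23685V threshold comes from the text above; the Gaussian RSNM model, the means and sigmas, the trial count, and the names `failure_rate`, `weak_cell`, and `strong_cell` are all hypothetical placeholders for illustration. The real study would draw RSNM from circuit simulation with device variation, so these numbers do not reproduce the reported failure rates.

```python
import random

# Read-fail criterion quoted above: RSNM < 0.23685V at Vdd = 0.6V.
RSNM_FAIL_THRESHOLD_V = 0.23685


def failure_rate(mean_rsnm_v, sigma_rsnm_v, trials=10_000, seed=0):
    """Fraction of Monte Carlo samples whose RSNM falls below the threshold.

    ASSUMPTION: RSNM is drawn from a Gaussian purely for illustration.
    """
    rng = random.Random(seed)
    fails = sum(
        1 for _ in range(trials)
        if rng.gauss(mean_rsnm_v, sigma_rsnm_v) < RSNM_FAIL_THRESHOLD_V
    )
    return fails / trials


# Hypothetical cells: one with low read margin, one with high read margin.
weak_cell = failure_rate(mean_rsnm_v=0.20, sigma_rsnm_v=0.02)
strong_cell = failure_rate(mean_rsnm_v=0.30, sigma_rsnm_v=0.02)
```

A cell whose RSNM distribution sits further above the threshold fails in fewer trials, which is the comparison the failure rates above express.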
IX. MOSFETs with Two Different Gate Lengths (280nm, 240nm)
Designed MOSFETs with two different gate lengths using a TCAD tool
Analyzed the device performances and parameters (ID-VD Characteristics, ID-VG Characteristics, Vth, ISAT, and DIBL)