CHUNG-PING CHUNG, PROFESSOR 鍾崇斌
Education:
June 1981 - August 1986
Ph. D. in electrical engineering, major field:
digital, Texas A&M University, College Station, Texas, U. S.
A.
January 1980 - May 1981
M. E. in electrical engineering, major field: digital,
Texas A&M University, College Station, Texas, U. S. A.
October 1972 - June 1976
B. E. in electrical engineering, National Cheng-Kung
University, Taiwan, R. O. C.
Professional Background:
August 1978 - August 1979
Programmer, Tatung Company, Taipei, Taiwan, R.
O. C.
September 1979 - December 1979
Programming Consultant, Computer Center, University
of Detroit, Detroit, Michigan, U. S. A.
September 1980 - December 1985
Teaching Assistant, Department of Electrical Engineering,
Texas A&M University, College Station, Texas, U. S. A.
January 1986 - August 1986
Lecturer, Department of Electrical Engineering,
Texas A&M University, College Station, Texas, U. S. A.
August 1986 to July 1993
Associate Professor, Department of Computer Science
and Information Engineering, National Chiao Tung University, Hsinchu,
Taiwan, R. O. C.
September 1991 to July 1992
Visiting Associate Professor, Department of Computer
Science, Michigan State University, East Lansing, Michigan, U. S.
A.
August 1993 to date
Professor, Department of Computer Science and Information
Engineering, National Chiao Tung University, Hsinchu, Taiwan, R.
O. C.
Research Interests:
Computer architecture, parallel processing, VLSI
system design, system simulation.
Research Plans
1. Computer architecture design:
Contemporary processor design trends will be studied,
and feasible new computer architectures will be proposed. Research
topics include: uniprocessor architectural design, multiprocessor
architectural design, distributed systems architectural design, and
massively parallel processor architectural design.
2. Study of parallel processing:
Based on the characteristics of different granularities
in exploiting parallelism, the principles and methodologies of the
various parallel processing techniques will be investigated. Such
techniques include multiprocessing, multithreaded processing, concurrent
instruction execution, and pipelined instruction execution.
3. Study of compiler techniques:
This topic concerns parallelizing compiler design,
which poses some of the most difficult and challenging issues in high-performance
computer systems design. Issues to be tackled include instruction
scheduling, register allocation, procedure partitioning, and synchronization.
Both modeling and simulation will be used to study these issues under different
scenarios.
4. Parallel memory design:
As the design philosophy of computers evolves rapidly,
memory design continually falls behind. Research interests
here include designs of multiple, multilevel, or multibank cache memories,
memory allocation, data distribution/consistency, and memory interleaving
in the various memory hierarchies.
5. System integration:
This topic concerns the task of system design by
putting together the research efforts on the many system components.
As a result, some basic characteristics of operating systems also need
to be studied. Simulation and cost/performance analysis and evaluation
will be stressed.
Publication List
A. Refereed Papers
- Chung-Ping Chung, Shyi-Chyi Cheng, Hong-Chich Chou, and Cheng
Chen, “Design of The Dual-ALU CRISC and Its Concurrent Execution,”
Journal of Information Science and Engineering, Vol. 5, No.
3, pp. 251-274, July 1989.
- Hong-Chich Chou, Chung-Ping Chung and Shyi-Chyi Cheng, “Dual-ALU
CRISC Architecture and Its Compiling Technique,” Computers and
Electrical Engineering, Vol. 17, No. 4, pp. 297-312, 1991.
- Cheng Chen, Chung-Ping Chung, Cheng-Chin Chiang, Hsin-Chia
Fu, and S. J. Wang, “An Or-Parallel Inference Model Based on Multi
RISC-Style Processing System,” Journal of Information Science and
Engineering, Vol. 7, No. 4, pp. 487-512, Dec. 1991.
- Hong-Chich Chou and Chung-Ping Chung, “A Bound Analysis of
Scheduling Instructions on Pipelined Processors with a Maximal Delay
of One Cycle,” Parallel Computing, Vol. 18, No. 4, pp. 393-399,
April 1992.
- Yuh-Horng Shiau and Chung-Ping Chung, “Adoptability and Effectiveness
of Microcode Compaction Algorithms in Superscalar Processing,” Parallel
Computing, Vol. 18, No. 5, pp. 497-510, May 1992.
- Yuh-Horng Shiau and Chung-Ping Chung, “The Statistical Model
of CRAY X-MP Vector Accesses,” Journal of the Chinese Institute of
Engineers, Vol. 15, No. 5, pp. 611-616, September 1992.
- Chung-Ping Chung and Wen-Yang Lin, “Vectorization of Sorting
Algorithms,” International Journal of High Speed Computing,
Vol. 4, No. 3, pp. 213-232, September 1992.
- Hong-Chich Chou and Chung-Ping Chung, “Upper Bound Analysis
of Scheduling Arbitrary-Delay Instructions on Typed Pipelined Processors,”
International Journal of High Speed Computing, Vol. 4, No.
4, pp. 301-312, Dec. 1992.
- Ren-Liang Cheng and Chung-Ping Chung, “Reaching Approximate
Agreement on Hypercube,” Parallel Computing, Vol. 19, No. 7,
pp. 765-775, July 1993.
- Ruey-Liang Ma, Chung-Ping Chung and Cheng Chen, “A Register
Window Scheduling Method for Prolog,” Journal of the Chinese Institute
of Engineers, Vol. 16, No. 6, pp. 793-806, November 1993.
- Hong-Chich Chou and Chung-Ping Chung, “Modeling of Superscalar
Instruction Scheduling and Analysis of a Heuristic Scheduling Algorithm,”
BIT, Vol. 33, pp. 354-371, 1993.
- Yuh-Horng Shiau and Chung-Ping Chung, “Effects and Handling
of Instruction Class Contention in Superscalar Processing,” International
Journal of High Speed Computing, Vol. 6, No. 3, pp. 357-373, Sep.
1994.
- Yuh-Horng Shiau and Chung-Ping Chung, “Effects of Hardware
Enhancements of Superscalar Performance,” Journal of Information
Science and Technology, Vol. 2, No. 3, 1993.
- Yuh-Horng Shiau and Chung-Ping Chung, “Benchmarking and Analysis
of Superscalar Architecture,” Journal of the Chinese Institute
of Engineers, Vol. 17, No. 2, pp. 169-177, March 1994.
- Hong-Chich Chou and Chung-Ping Chung, “Optimal Multiprocessor
Task Scheduling Using Dominance and Equivalence Relations,” to appear
in Computers and Operations Research, Vol. 21, No. 4, pp. 463-475,
1994.
- Hong-Chich Chou and Chung-Ping Chung, “An Optimal Instruction
Scheduler for Superscalar Processor,” to appear in IEEE Transactions
on Parallel and Distributed Systems, Vol. 6, No. 3, pp. 303-313,
March, 1995.
- Ruey-Liang Ma and Chung-Ping Chung, “Periodic Adaptive Branch
Prediction for Superscalar Processing in Prolog,” The Computer
Journal, Vol. 38, No. 6, pp. 457-470, 1995.
- Ruey-Liang Ma and Chung-Ping Chung, “Branch Prediction for
Enhancing Fine-grained Parallelism in Prolog,” Journal of Information
Science and Technology, Vol. 4, No. 2, 1995.
- Hong-Chich Chou and Chung-Ping Chung, “On the Upper Bound
of Scheduling Instructions on Pipelined Processors with Delay,” Journal
of the Chinese Institute of Engineers, Vol. 18, No. 1, pp. 101-108,
Jan. 1995.
- Neng-Pin Lu and Chung-Ping Chung, “Memory System Design in
Superscalar Processing,” International Journal of High Speed Computing,
Sep. 1995, Vol. 7, No. 3, pp. 421-443.
- Ren-Liang Cheng and Chung-Ping Chung, “An Approximate Agreement
Algorithm for Wraparound Meshes,” International Journal of High
Speed Computing, Vol. 7, No. 3, pp. 407-419, Sep. 1995.
- Neng-Pin Lu and Chung-Ping Chung, “A Fault Tolerant Multistage
Combining Network,” Journal of Parallel and Distributed Computing,
Vol. 34, pp. 14-28, 1996.
- Ren-Liang Cheng and Chung-Ping Chung, “Local Interactive
Convergence on Hypercube,” International Journal of Computers &
Applications, Vol. 19, No. 1, pp. 1-5, 1997.
- Tang-Show Hwang, Neng-Pin Lu and Chung-Ping Chung, “Delay Precise
Invalidation - A Software Cache Coherence Scheme,” IEE Proceedings: Computers
and Digital Techniques, Vol. 143, No. 5, pp. 337-344, Sep. 1996.
B. Conference Papers
- C. P. Chung, T. T. Tsai, et al., “VLSI Design of Fast RISC-Style
Prolog Machine,” Proceedings of 1987 International Symposium on
VLSI Technology, System, and Applications, Taipei, Taiwan,
R. O. C., May 13-15, 1987, pp. 369-372.
- C. P. Chung, H. C. Fu, et al., “Study of Artificial Intelligence
Multiprocessor System,” Proceedings of the Seventh Workshop on
Computer System Technology, Nantou, Taiwan, R. O. C., Aug 12-15,
1987, pp. 307-354.
- C. P. Chung, C. C. Chiang, et al., “A Further Performance
Evaluation on LISCP, A Fast RISC-Style Prolog Machine,” Proceedings
of National Computer Symposium 1987, Taipei, R. O. C., Dec. 17-18,
1987, pp. 11-20.
- C. P. Chung, T. T. Tsai, et al., “VLSI Design and Implementation
of LISCP--A Fast RISC-Style Prolog Machine,” Proceedings of National
Computer Symposium 1987, Taipei, R. O. C., Dec. 17-18, 1987, pp.
30-39.
- C. P. Chung and R. L. Cheng, “An Efficient Cache Consistency
Protocol for the Shared-Bus Multiprocessor Systems,” Proceedings
of National Computer Symposium 1987, Taipei, R. O. C., Dec. 17-18,
1987, pp. 95-101.
- C. P. Chung, C. C. Chung, et al., “A Study of Parallel Execution
Model for Prolog on a RISC-Style Multiprocessor System,” Proceedings
of National Computer Symposium 1987, Taipei, R. O. C., Dec. 17-18,
1987, pp. 102-111.
- C. P. Chung, H. C. Fu, et al., “A Multiprocessor System for
Prolog Processing,” Proceedings of the Second IEEE Conference on
Computer Workstations, Santa Clara, CA, U. S. A., Mar. 7-10, 1988,
pp. 60-69.
- C. P. Chung, H. C. Fu, et al., “The Study of AI Multiprocessor
System,” Proceedings of The Eighth Workshop on Computer System
Technology, R. O. C., Aug. 7-9, 1988, pp. 101-124.
- C. P. Chung, S.C. Jeng and C. Chen, “Design of the Dual-ALU
CRISC and Its Dual-Stream Instruction Execution,” Proceedings of
Acer Student Thesis Awards, Taipei, R. O. C., Sept 1988, pp. 115-137.
- C. P. Chung, Z. C. Hwang, et al., “Memory Subsystem of the
MCRISC,” Proceedings of International Computer Symposium 1988,
Taipei, R. O. C., Dec. 15-17, 1988, pp. 342-348.
- C. P. Chung, H. C. Chow, et al., “The Study and Realization
of The CRISC Code Compaction Methodology,” Proceedings of International
Computer Symposium 1988, Taipei, R. O. C., Dec. 15-17, 1988, pp.
349-354.
- C. P. Chung, C. C. Chiang, et al., “Design and Implementation
of a Feasible Run-Time Intelligent Backtracking Scheme for Prolog,”
Proceedings of International Computer Symposium 1988, Taipei,
R. O. C., Dec. 15-17, 1988, pp. 659-664.
- C. P. Chung, T. C. Chang, et al., “Design of LISCP-II: An
Improved RISC-Style Processor for Prolog,” Proceedings of International
Computer Symposium 1988, Taipei, R. O. C., Dec. 15-17, 1988, pp.
665-670.
- C. P. Chung, Y. M. Hsu, et al., “A New AND-Parallel Execution
Model for Prolog--Forward Execution Method,” Proceedings of International
Computer Symposium 1988, Taipei, R. O. C., Dec. 15-17, 1988, pp.
800-805.
- C. P. Chung and S. J. Fu, “Degenerate Corner Stitching —
A Data-Structuring Technique for Interactive VLSI Layout Tools,” Proceedings
of International Computer Symposium 1988, Taipei, R. O. C., Dec.
15-17, 1988, pp. 1123-1128.
- C. P. Chung, Q. Z. Wu, and K. T. Sun, “ANDROR: A Parallel
Execution Model for Logic Programs,” Proceedings of IASTED Expert
Systems Theory and Applications, Zurich, Switzerland, June 26-28,
1989.
- C. P. Chung, “Study and Design of A High-Speed Computer Architecture,”
Proceedings of MIST Workshop of 1989, Hsinchu, Taiwan, R. O.
C., Oct 17 and 18, 1989, pp. C1-1 to C1-21.
- C. P. Chung and R. L. Ma, “Design and Considerations of A
Prolog Compiler for LISCP-II,” Proceedings of Acer Student Thesis
Awards, Taipei, R. O. C., Oct 1989, pp. 347-358.
- C. P. Chung and Y. H. Shiau, “Study of Cray X-MP Vector Accesses
Using Different Storage Schemes,” Proceedings of National Computer
Symposium 1989, Taipei, R. O. C., Dec 21 and 22, 1989, pp. 385-394.
- C. P. Chung and S. W. Tung, “Performance Evaluation of LISCP-II,
a Prolog Machine,” Proceedings of National Computer Symposium 1989,
Taipei, R. O. C., Dec 21 and 22, 1989, pp. 478-487.
- C. P. Chung and S. W. Tung, “Register File Design of LISCP-II,
a Prolog Machine,” Proceedings of National Computer Symposium 1989,
Taipei, R. O. C., Dec 21 and 22, 1989, pp. 564-573.
- C. P. Chung and R. L. Ma, “Dynamic Database Management of
LISCP-II Prolog Compiler,” Proceedings of National Computer Symposium
1989, Taipei, R. O. C., Dec 21 and 22, 1989, pp. 584-593.
- Q. Z. Wu, K. T. Sun, J. Z. Lin and C. P. Chung “ANDROR: A
Parallel Execution Model for Logic Programs,” Proceedings of National
Computer Symposium 1989, Taipei, R. O. C., Dec. 21-22, 1989, pp.
657-665.
- I. K. Chou, C. P. Chung and C. Chen, “A Loop Partitioning
Method for Multitasking in a High Speed Multiprocessing Architecture,”
Proceedings of MIST Workshop of 1990, Hsinchu, Taiwan, R. O.
C., Oct. 17, 1990, pp. C5-1-C5-29.
- C. P. Chung and C. H. Tsai, “Study of Vectorization of FFT,”
Proceedings of International Computer Symposium 1990, Hsinchu,
Taiwan, R. O. C., Dec. 17-19, 1990.
- I. K. Chou, C. P. Chung and C. Chen, “A Dependence-Based
Loop Partitioning Method for Multitasking in a Vector Computer,” Proceedings
of International Computer Symposium 1990, Hsinchu, Taiwan, R.
O. C., Dec. 17-19, 1990.
- C. Chen, C. P. Chung, C. C. Chiang, H. C. Fu, T. C. Chang,
R. L. Ou and S. J. Wang, “Parallel Inference Model Based on Multiple
RISC-Style Processing System,” Proceedings of First Workshop on
Parallel Processing, Hsinchu, Taiwan, R. O. C., Dec. 20-21, 1990.
- C. P. Chung, Y. K. Chen and Y. H. Shiau, “A Hardware Approach
to Parallel Instruction Decoding and Issuing,” Proceedings of National
Computer Symposium 1991, Chungli, Taiwan, R. O. C., Dec. 1991,
pp. 117-124.
- R. L. Ma, C. P. Chung and C. Chen, “A Register File Management
Method for Prolog System,” Proceedings of 1992 International Computer
Symposium, Taichung, Taiwan, R. O. C., Dec. 13-15, 1992, pp. 127-134.
- H. C. Chou and C. P. Chung, “Optimal Multiprocessor Task
Scheduling Using Dominance and Equivalence Relations,” Proceedings
of 1992 International Computer Symposium, Taichung, Taiwan, R.
O. C., Dec. 13-15, 1992, pp. 707-714.
- Y. H. Shiau and C. P. Chung, “Effects of Class Conflicts
in Superscalar Processing,” Proceedings of 1992 International Computer
Symposium, Taichung, Taiwan, R. O. C., Dec. 13-15, 1992, pp. 1182-1188.
- C. Z. Lin, C. C. Tseng and C. P. Chung, “Analyzing Cache
Performance on Multi-Stream Execution Processor,” Proceedings of
IEEE TENCON'93, Beijing, China, Oct. 19-21, 1993.
- N. P. Lu and C. P. Chung, “Memory System Design in Superscalar
Processing,” Proceedings of National Computer Symposium 1993,
Chiayi, Taiwan, R. O. C., Dec. 1993, pp. 95-105.
- C. P. Chung and Y. H. Shiau, “Constructing Register Live
Ranges with Maximum Instruction Parallelism Retained,” Proceedings
of National Computer Symposium 1993, Chiayi, Taiwan, R. O. C.,
Dec. 1993, pp. 901-911.
- N. P. Lu, T. S. Hwang and C. P. Chung, “Design of Memory
System Supporting Speculative Store,” Proceedings of 1994 Workshop
on Computer System Applications, Nantou, R. O. C., April 22 and
23, 1994, pp. 33-37.
- R. L. Ma, D. L. Liu and C. P. Chung, “Reducing Branch Overhead
and Enhancing Fine-Grained Parallelism in Prolog System,” Proceedings
of 1994 Workshop on Computer System Applications, Nantou, R. O.
C., April 22 and 23, 1994, pp. 38-42.
- Y. Y. Chiang, C. Wu and C. P. Chung, “Implementation of A
Low Cost Distributed Debugger,” Proceedings of 1994 Workshop on
Computer System Applications, Nantou, R. O. C., April 22 and 23,
1994, pp. 43-47.
- Y. Y. Chiang, C. Wu and C. P. Chung, “The Design of a Computer
Assisted Instruction System for Computer Organization Learning,” Proceedings
of 1994 International Conference on Engineering Education, Taipei,
pp. 229-234.
- C. P. Chung and R. L. Ma, “The Analysis of Instruction Level Parallelism
in Prolog Superscalar Processor,” Proceedings of the First Tri-Service
Military Academies Basic Academic Symposium, Fengshan, Kaohsiung,
June 3, 1994, pp. 699-706.
- T. S. Hwang, N. P. Lu and C. P. Chung, “A Software-Based
Cache Coherence Scheme with Delay Invalidation,” Proceedings of
1994 Workshop on Advanced Information Systems, Hsinchu, Taiwan,
R. O. C., May 25, 1994, pp. 47-67.
- Y. Y. Chiang, C. Wu and C. P. Chung, “A Distributed System
Model for the Training Simulator of Marine Diesel Propulsion Systems,”
Proceedings of The Third International Conference on Automation
Technology, Taipei, Taiwan, R. O. C., July 1994, Vol. 6, pp. 69-73.
- R. L. Ma and C. P. Chung, “The Simulation and Analysis of
Instruction Level Parallelism in Prolog Superscalar Processor,” Proceedings
of International Computer Symposium 1994, Hsinchu, Taiwan, R.
O. C., Dec. 12-15, 1994, pp. 55-60.
- N. P. Lu and C. P. Chung, “A Cache Coherence Protocol for
Speculative Execution in Multiprocessors,” Proceedings of International
Computer Symposium 1994, Hsinchu, Taiwan, R. O. C., Dec. 12-15,
1994, pp. 179-186.
- C. W. Chen and C. P. Chung, “Time Interval-Based Coloring
Approach to Register Allocation,” Proceedings of International
Computer Symposium 1994, Hsinchu, Taiwan, R. O. C., Dec. 12-15,
1994, pp. 315-321.
- T. S. Hwang and C. P. Chung, “Delay Precise Invalidation
-- A Software Cache Coherence Scheme,” Proceedings of the 1994
International Conference on Parallel and Distributed Systems,
Hsinchu, Taiwan, R. O. C., Dec. 19-21, 1994, pp. 524-529.
- R. L. Ma and C. P. Chung, “Branch Prediction for Enhancing
Fine-Grained Parallelism and Speedup Prolog Execution,” Proceedings
of the 1994 International Conference on Parallel and Distributed Systems,
Hsinchu, Taiwan, R. O. C., Dec. 19-21, 1994, pp. 744-751.
- C. P. Chung and N. P. Lu, “A Speculative Memory Access Technique:
Speculative Store,” ROC Year 84 Army Academy Basic Academic Symposium
on Electrical Engineering and Information Science, 1995, pp. 159-166.
- R. L. Ma and C. P. Chung, “Architectural Tradeoffs between
SPS--a Superscalar Prolog System and PUMTS--a Parallel Unification
Multi-Thread System,” Workshop on CPU Research and Development,
1995, pp. 3-10.
- Kelvin Lin, N. P. Lu, Y. C. Ma, Pei Ouyang and C. P. Chung,
“A Cache Coherence Protocol for Clustered Multiprocessors,” Workshop
on Distributed System Technologies and Applications, 1995, pp.
51-57.
- C. P. Chung, N. P. Lu, and T. S. Hwang, “Study of Memory
System Design for Superscalar Multiprocessors,” Workshop on High
Performance Multiprocessor Systems, 1995, pp. 3-7.
- C. C. Liu, R. M. Shiu and C. P. Chung, “Register Renaming
for x86 Superscalar Design,” Proceedings of the 1996 International
Conference on Parallel and Distributed Systems, Tokyo, Japan,
June 3-6, 1996, pp. 336-343.
- Neng-Pin Lu and Chung-Ping Chung, Apr. 1996, “Speculative
Store in Distributed-Memory Multiprocessors,” Proceedings of 1996
International Conference Computer Systems Technology for Industrial
Applications, pp. 81-88.
- Neng-Pin Lu and Chung-Ping Chung, May 1996, “Evaluating Cache-Coherent
Write-Policies for Speculative Memory Access,” Proceedings of 1996
Workshop on Distributed System Technologies and Applications,
pp. 91-98.
- Hsiou-Ping Tsai, Yen-Yuan Chiang, and Chung-Ping Chung, Kong
Dar Fan, July 1996, “A Novel Medium Access Control Protocol and Its
Implementation for Wireless PCNs,” Second Workshop on Real Time
and Media Systems, pp. 41-47.
- Kelvin Lin, Neng-Pin Lu, Yeong-Chang Maa, and Chung-Ping Chung,
December 1996, “Enhancing The SCI Cache Coherence Protocol for Multiprocessor
Clusters,” Proceedings of International Conference on Computer
Architecture, pp. 185-192.
- W. Y. Shieh, Y. U. Chiang, and C. P. Chung, January 1997,
“A Hypercube-Style Video-on-Demand Server Architecture,” The 11th
International Conference on Information Networking.
- Lee-Ren Ton, Lung-Chung Chang, Min-Fu Kao, Han-Min Tseng, Shi-Sheng
Shang, Ruey-Liang Ma, Dze-Chaung Wang and Chung-Ping Chung,
“Instruction Folding in Java Processor,” Proceedings of the 1997
International Conference on Parallel and Distributed Systems,
December 10-13, 1997, pp. 138-143.
- Shyh-An Chi, R-Ming Shiu, Jih-Chiang Chiu, Si-En Chang, and Chung-Ping
Chung, “Instruction Cache Prefetching with Extended BTB,” Proceedings
of the 1997 International Conference on Parallel and Distributed Systems,
December 10-13, 1997, pp. 360-365.
- Neng-Pin Lu and Chung-Ping Chung, “Parallelism Exploitation
in Superscalar Multiprocessing,” Proceedings of National Computer
Symposium 1997, pp. C-82 – C-88.
C. Other Publications
- C. P. Chung, “A VLSI Cache RISC for the C Language,” Ph.D.
dissertation, Texas A&M University, Texas, U. S. A., August 1986.
- “Study of AI Multiprocessor Systems,” National Science Council research
project report, March 1987.
- “Design of a Multiprocessor System for the C Language,” ERSO, ITRI
research project report, July 1987.
- “First Quarter of Research in System Simulation Techniques, on Continuous
System Simulation,” Sun Yet-Sien Science Institute research project
report, August 1987.
- “Second Quarter of Research in System Simulation Techniques, on
Continuous System Simulation,” Sun Yet-Sien Science Institute research
project report, November 1987.
- “Third Quarter of Research in System Simulation Techniques, on Continuous
System Simulation,” Sun Yet-Sien Science Institute research project
report, February 1988.
- “Fourth Quarter of Research in System Simulation Techniques, on
Continuous System Simulation,” Sun Yet-Sien Science Institute research
project report, May 1988.
- “Design of a Multiprocessor System for the C Language,” ERSO, ITRI
research project report, July 1988.
- “Study of AI Multiprocessor Systems,” National Science Council research
project report, August 1988.
- “Study and Design of High-Speed Computer System,” ERSO, ITRI research
project report, March 1989.
- “Research on Parallelism and Parallel Processing in Logic Programming,”
ATC, ERSO, ITRI research project report, March 1989.
- “High-Speed Computing in Continuous System Simulation,” Sun Yet-Sien
Science Institute research project report, May 1989.
- “Study and Design of High-Speed Computer Architecture,” ERSO, ITRI
research project report, August 1989.
- “Study of AI Multiprocessor Systems,” National Science Council research
project report, August 1989.
- “Study and Design of High-Speed Computer System,” ERSO, ITRI research
project report, August 1989.
- “High-Speed Computing in Continuous System Simulation,” Sun Yet-Sien
Science Institute research project report, August 1989.
- “Research on Parallelism and Parallel Processing in Logic Programming,”
ATC, ERSO, ITRI research project report, October 1989.
- “High-Speed Computing in Continuous System Simulation,” Sun Yet-Sien
Science Institute research project report, November 1989.
- “Study of High Performance Multiple Functional Unit Computer Architecture,”
Sun Yet-Sien Science Institute research project report, February 1990.
- “High-Speed Computing in Continuous System Simulation,” Sun Yet-Sien
Science Institute research project report, February 1990.
- “Study and Design of High-Speed Computer Architecture,” ERSO, ITRI
research project report, August 1990.
- “Study of AI Multiprocessor Systems,” National Science Council research
project report, August 1990.
- “Study of High Performance Multiple Functional Unit Computer Architecture,”
Sun Yet-Sien Science Institute research project report, August 1990.
- “Analysis of Vector Processing Characteristics and Study of Its
Library Routine Coding,” Sun Yet-Sien Science Institute research project
report, August 1990.
- “Study of a Fault-Tolerant, Real-Time Distributed System,” National
Science Council research project report, August 1990.
- “High-Speed Computing in Continuous System Simulation,” Sun Yet-Sien
Science Institute research project report, February 1991.
- “Research on a Superscalar and Superpipeline Based High-Performance
Computer Architecture Design,” CCL, ITRI research project report,
August 1991.
- “Study of High-Performance Multiple Functional Unit Computer Architecture,”
Sun Yet-Sien Science Institute research project report, August 1991.
- “Study of Dataflow Computer Characteristics and Their Applications
in Real System Designs,” Sun Yet-Sien Science Institute research project
report, August 1991.
- “Study of AI Multiprocessor Systems,” National Science Council research
project report, August 1991.
- “Study of Superscalar Processing Techniques,” National Science Council
research project report, February 1992.
- “A Debugging System for Concurrent Distributed Processing,” National
Science Council research project report, August 1992.
- “Study of Superscalar Processing Techniques,” National Science Council
research project report, February 1993.
- “Study of Fine-Grained Parallel Processing of Logic Programs (I),”
National Science Council research project report, August 1993.
- “Software Approach to Cache Coherence in Superscalar Multiprocessor
(I),” ATC, CCL, ITRI research project semiannual report, January 1994.
- “The Study of Memory System Design in Superscalar Processing (I),”
National Science Council research project report, February 1994.
- “Study of Superscalar Processing Techniques (II),” National Science
Council research project report, February 1994.
- “Study of Fine-Grained Parallel Processing of Logic Programs (II),”
National Science Council research project report, August 1994.
- “The Study of Memory System Design in Superscalar Processing (II),”
National Science Council research report, March 1995.
- “Design of a Multiprocessor Architecture Simulation Environment,”
ATC, CCL, ITRI research project report, Jun. 1995.
- “Study and Design of Memory System in Superscalar Multiprocessors,”
National Science Council research report, Aug. 1995.
- “Study of Parallel Processing and Performance Evaluation Environment,”
ATC, CCL, ITRI research report, Aug. 1996.
- “Study of Memory System Techniques for Clustered Multiprocessors,”
National Science Council research report, Aug. 1996.
- “Study of Superscalar Processor and Superscalar-based Multiprocessor
Code Scheduling Techniques,” National Science Council research report,
Aug. 1996.
D. Supervised Student Theses
- Ren-Liang Cheng, “The Study of Consistency Problem for Bus-Connected
Multiprocessor Systems,” master thesis, Institute of Computer Engineering,
National Chiao Tung University, Hsinchu, Taiwan, R. O. C., June 1987.
- Shi-Jiar Baw, “The VLSI Design and Implementation of LISCP, Part
I: The Alu, Tag Manipulator and Interface Circuits,” master thesis,
Institute of Electronics, National Chiao Tung University, Hsinchu,
Taiwan, R. O. C., June 1987.
- Ter-Tsung Tsai, “The VLSI Design and Implementation of LISCP, Part
II: The Register File Subsystem and Control Unit,” master thesis,
Institute of Electronics, National Chiao Tung University, Hsinchu,
Taiwan, R. O. C., June 1987.
- Chi-Yu Fu, “A Parallel Execution Model of Concurrent C on Multiprocessors,”
master thesis, Institute of Computer Engineering, National Chiao Tung
University, Hsinchu, Taiwan, R. O. C., June 1988.
- Hong-Chich Chou, “The Study and Realization of the CRISC Code Compaction
Methodology,” master thesis, Institute of Computer Engineering, National
Chiao Tung University, Hsinchu, Taiwan, R. O. C., June 1988.
- Shyi-Chyi Jeng, “Design of The Dual-ALU CRISC and Its Dual-Stream
Instruction Execution,” master thesis, Institute of Electronics, National
Chiao Tung University, Hsinchu, Taiwan, R. O. C., June 1988.
- Zeng-Chen Hwang, “Architectural Specifications of the CRISC and
Cache Support for The MCRISC,” master thesis, Institute of Electronics,
National Chiao Tung University, Hsinchu, Taiwan, R. O. C., June 1988.
- Sheng-Jen Fu, “Degenerate Corner Stitching--A Data Structuring Technique
for Interactive VLSI Layout Tools,” master thesis, Institute of Applied
Mathematics, National Chiao Tung University, Hsinchu, Taiwan, R. O.
C., June 1988.
- Cheng-Chin Chiang, “Design of an Efficient OR-Parallel Execution
Model with Intelligent Backtracking for Prolog,” master thesis, Institute
of Computer Engineering, National Chiao Tung University, Hsinchu,
Taiwan, R. O. C., June 1988.
- Shih-Jay Wang, “Design of MIEP: A Multiple Inference Engines for
Prolog,” master thesis, Institute of Computer Engineering, National
Chiao Tung University, Hsinchu, Taiwan, R. O. C., June 1988.
- Tungchi Chang, “Design of LISCP-II: An Improved RISC-Style Processor
for Prolog,” master thesis, Institute of Computer Engineering, National
Chiao Tung University, Hsinchu, Taiwan, R. O. C., June 1988.
- Yung-Ming Hsu, “Study and Design of a New AND-Parallel Execution
Model for Prolog,” master thesis, Institute of Computer Engineering,
National Chiao Tung University, Hsinchu, Taiwan, R. O. C., June 1988.
- Shing-Wu Tung, “Design Considerations About a Prolog RISC Processor:
LISCP-II,” master thesis, Institute of Computer Science and Information
Engineering, National Chiao Tung University, Hsinchu, Taiwan, R. O.
C., July 1989.
- Hsi-Long Tsai, “The Study and Implementation of a Prolog Language
Processor LISCP-II,” master thesis, Institute of Computer and Information
Science, National Chiao Tung University, Hsinchu, Taiwan, R. O. C.,
July 1989.
- Ruey-Liang Maa, “Design and Consideration of a Prolog Compiler for
LISCP-II,” master thesis, Institute of Computer Science and Information
Engineering, National Chiao Tung University, Hsinchu, Taiwan, R. O.
C., July 1989.
- Wen-Lung Lin, “CSMP--A Cluster-Structured Multiprocessor System
for Continuous System Simulation,” master thesis, Institute of Computer
Science and Information Engineering, National Chiao Tung University,
Hsinchu, Taiwan, R. O. C., June 1989.
- Chin-Wei Chen, “Integrated Considerations of Real-Time and Fault-Tolerant
Requirements in Distributed Computing Systems,” master thesis, Institute
of Computer Science and Information Engineering, National Chiao Tung
University, Hsinchu, Taiwan, R. O. C., June 1989.
- Po-Chi Chen, “Task Allocation on Distributed Computing Systems—A
Simulated Annealing Approach,” master thesis, Institute of Computer
Science and Information Engineering, National Chiao Tung University,
Hsinchu, Taiwan, R. O. C., June 1989.
- Quen-Zong Wu, “ANDROR: A New AND/OR-Parallel Execution Model For
Prolog,” master thesis, Institute of Computer Science and Information
Engineering, National Chiao Tung University, Hsinchu, Taiwan, R. O.
C., June 1989.
- Chun-Hung Wen, “Design of CDFA--A Controlled Data-Flow Architecture,”
master thesis, Institute of Computer Science and Information Engineering,
National Chiao Tung University, Hsinchu, Taiwan, R. O. C., July 1989.
- Yuh-Horng Shiau, “Study of CRAY X-MP Vector Accesses Using Different
Storage Schemes,” master thesis, Institute of Computer Science and
Information Engineering, National Chiao Tung University, Hsinchu,
Taiwan, R. O. C., July 1989.
- Yeang-Ming Shih, “Dynamic Parallel Instruction Scheduling and Synchronization
in Vector Supercomputing,” master thesis, Institute of Electronics,
National Chiao Tung University, Hsinchu, Taiwan, R. O. C., July 1989.
- Der-Cherng Lee, “A Design Methodology of Vector Unit Architecture
and Dynamic Reconfigure Vector Register Design,” master thesis, Institute
of Electronics, National Chiao Tung University, Hsinchu, Taiwan, R.
O. C., July 1989.
- Yeong-Sheng Chen, “An Instrumentation Tool for Parallel Processing--Design,
Implementation and Applications,” master thesis, Institute of Computer
Science and Information Engineering, National Chiao Tung University,
Hsinchu, Taiwan, R. O. C., June 1990.
- Chin-Yao Chiang, “A Distributed Hard Real-Time Task Scheduling Based
on Criticalness or Alternative Algorithms,” master thesis, Institute
of Computer Science and Information Engineering, National Chiao Tung
University, Hsinchu, Taiwan, R. O. C., June 1990.
- Yau-Shan Chen, “Clock Synchronization in a Hypercube Distributed
System,” master thesis, Institute of Computer Science and Information
Engineering, National Chiao Tung University, Hsinchu, Taiwan, R. O.
C., June 1990.
- Che-Hsien Tsai, “The Study of Vectorization of the FFT,” master
thesis, Institute of Computer Science and Information Engineering,
National Chiao Tung University, Hsinchu, Taiwan, R. O. C., June 1990.
- Wen-Yang Lin, “The Study of Vectorization of Sorting Algorithms,”
master thesis, Institute of Computer Science and Information Engineering,
National Chiao Tung University, Hsinchu, Taiwan, R. O. C., June
1990.
- Ruey-Lung Ou, “Realization of OR Parallelism on a Hypercube MIEP,”
master thesis, Institute of Computer Science and Information Engineering,
National Chiao Tung University, Hsinchu, Taiwan, R. O. C., June 1990.
- Chi-Tien Yeh, “Cut and Some Other Considerations on a Hypercube
MIEP System,” master thesis, Institute of Computer Science and Information
Engineering, National Chiao Tung University, Hsinchu, Taiwan, R. O.
C., June 1990.
- Cheng-Zen Yang, “Task Assignment on CSMP for Continuous System Simulation--A
Grain Aggregating Approach,” master thesis, Institute of Computer
Science and Information Engineering, National Chiao Tung University,
Hsinchu, Taiwan, R. O. C., June 1990.
- Terry Chi, “Design of LISCP-II Prototype,” master thesis, Institute
of Computer Science and Information Engineering, National Chiao Tung
University, Hsinchu, Taiwan, R. O. C., June 1990.
- Yung-Ming Tzeng, “A Data-Driven Based Architectural Approach to
Parallel Execution of Sequential Programs,” master thesis, Institute
of Computer Science and Information Engineering, National Chiao Tung
University, Hsinchu, Taiwan, R. O. C., July 1990.
- Shyh-Ming Wang, “Design and Implementation of LISCP-II Memory System
and Its Interface to Host,” master thesis, Institute of Computer Science
and Information Engineering, National Chiao Tung University, Hsinchu,
Taiwan, R. O. C., July 1990.
- I-Kuang Chou, “A Dependency-Based Loop Partitioning Method for Multitasking
in a Vector Computer,” master thesis, Institute of Computer Science
and Information Engineering, National Chiao Tung University, Hsinchu,
Taiwan, R. O. C., June 1990.
- Chiou-Luenn Lin, “Analysis of Availability and Reliability of AT&T
No. 5ESS,” master thesis, Institute of Computer Science and Information
Engineering, National Chiao Tung University, Hsinchu, Taiwan, R. O.
C., June 1991.
- Yuan-Kai Chen, “A Hardware Approach to Parallel Instruction Decoding
and Issuing,” master thesis, Institute of Computer Science and Information
Engineering, National Chiao Tung University, Hsinchu, Taiwan, R. O.
C., July 1991.
- Kuo-Kuang Teng, “Design and Performance Evaluation of a Cluster-Structured
Multiprocessor System for Continuous System Simulation,” master thesis,
Institute of Computer Science and Information Engineering, National
Chiao Tung University, Hsinchu, Taiwan, R. O. C., July 1991.
- Yen-Yuan Chiang, “Integration of AT Bus and Multibus in a Single-Board
Computer,” master thesis, Institute of Computer Science and Information
Engineering, National Chiao Tung University, Hsinchu, Taiwan, R. O.
C., July 1991.
- Kun-Chen Wu, “Design and Implementation of a Prolog Processor LISCP-II,”
master thesis, Institute of Computer Science and Information Engineering,
National Chiao Tung University, Hsinchu, Taiwan, R. O. C., July 1991.
- Neng-Pin Lu, “A Fault-Tolerant Multistage Combining Network,” master
thesis, Institute of Computer Science and Information Engineering,
National Chiao Tung University, Hsinchu, Taiwan, R. O. C., July 1991.
- Chi-Der Von, “A Study of Multi-Operation Instruction Set Architecture,”
master thesis, Institute of Computer Science and Information Engineering,
National Chiao Tung University, Hsinchu, Taiwan, R. O. C., July 1991.
- Hong-Chich Chou, “Study of Superscalar Instruction Scheduling,”
Ph.D. dissertation, Institute of Computer Science and Information
Engineering, National Chiao Tung University, Hsinchu, Taiwan, R. O.
C., October 1992.
- Yuh-Horng Shiau, “Study of Superscalar Processing,” Ph.D. dissertation,
Institute of Computer Science and Information Engineering, National
Chiao Tung University, Hsinchu, Taiwan, R. O. C., September 1993.
- Ren-Liang Cheng, “The Approximate Agreement of Massively Parallel
Systems,” Ph.D. dissertation, Institute of Computer Science and Information
Engineering, National Chiao Tung University, Hsinchu, Taiwan, R. O.
C., January 1994.
- Tang-Show Hwang, “Delayed Invalidation -- A Software-Based Cache
Coherence Scheme,” master thesis, Institute of Computer Science and
Information Engineering, National Chiao Tung University, Hsinchu,
Taiwan, R. O. C., June 1994.
- Ching-Wei Chen, “Time Interval-Based Coloring Approach to Register
Allocation,” master thesis, Institute of Computer Science and Information
Engineering, National Chiao Tung University, Hsinchu, Taiwan, R. O.
C., June 1994.
- Ding-Long Liu, “PUMTS: A Parallel Unification Multi-Threaded Superscalar
Processor for Prolog,” master thesis, Institute of Computer Science
and Information Engineering, National Chiao Tung University, Hsinchu,
Taiwan, R. O. C., June 1994.
- Chung-Cheng Feng, “A Study of Prolog AND Parallel Execution Model
on Multi-Threaded Superscalar System,” master thesis, Institute of
Computer Science and Information Engineering, National Chiao Tung
University, Hsinchu, Taiwan, R. O. C., June 1995.
- Chuan-Cheng Hsu, “Prolog OR Parallel Execution Model on Multi-Threaded
Superscalar Processor,” master thesis, Institute of Computer Science
and Information Engineering, National Chiao Tung University, Hsinchu,
Taiwan, R. O. C., June 1995.
- Chang-Chung Liu, “A Study of Register Renaming in x86 Superscalar
Processor,” master thesis, Institute of Computer Science and Information
Engineering, National Chiao Tung University, Hsinchu, Taiwan, R. O.
C., June 1995.
- Cheng-Shon Kuo, “Design of Instruction Queue for x86 Superscalar
Processor with Branch Prediction,” master thesis, Institute of Computer
Science and Information Engineering, National Chiao Tung University,
Hsinchu, Taiwan, R. O. C., June 1995.
- Ruey-Liang Ma, “A Study of Fine-Grained Parallel Processing for
Logic Programs,” Ph.D. dissertation, Institute of Computer Science
and Information Engineering, National Chiao Tung University, Hsinchu,
Taiwan, R. O. C., June 1995.
- Jih-Shiung Gau, “The Study of Branch Prediction Strategies Used
by High Performance Processors,” master thesis, Institute of Computer
Science and Information Engineering, National Chiao Tung University,
Hsinchu, Taiwan, R. O. C., June 1996.
- Shei-Sin Ju, “A Study of Parallelization in Multi-Processor System
Simulation,” master thesis, Institute of Computer Science and Information
Engineering, National Chiao Tung University, Hsinchu, Taiwan, R.
O. C., June 1996.
- Che-Sheng Cheng, “Performance Evaluation of Various Types of Decoded
Instruction Caches for x86 Superscalar Processor,” master thesis, Institute
of Computer Science and Information Engineering, National Chiao Tung
University, Hsinchu, Taiwan, R. O. C., June 1996.
- Shyh-An Chi, “Instruction Cache Prefetching Based on An Extended
BTB,” master thesis, Institute of Computer Science and Information
Engineering, National Chiao Tung University, Hsinchu, Taiwan, R. O.
C., June 1997.
- Jieh-Nan Yang, “Instruction Fetcher Design for an X86 Superscalar
Processor,” master thesis, Institute of Computer Science and Information
Engineering, National Chiao Tung University, Hsinchu, Taiwan, R. O.
C., June 1997.
- Han-Min Tseng, “ILP Analysis on the Java Machine,” master thesis,
Institute of Computer Science and Information Engineering, National
Chiao Tung University, Hsinchu, Taiwan, R. O. C., June 1997.
Research and Project Abstracts
Project period:
June 1996 to May 1999 (ROC calendar 85.6 – 88.5)
Project title:
前瞻性微處理機設計與製造之子計畫八:指令解讀安排單元及資料存取單元
Design and Implementation of an Advanced Microprocessor Sub-project
8: Instruction Decode/Schedule Unit and Data Load/Store Unit
Sponsor:
National Science Council
本計畫的目標,在設計合乎x86指令集特色的指令擷取解讀單元及資料存取單元。指令擷取解讀單元的研究項目包括指令的擷取、解碼及分配;而資料存取單元的研究項目則包括各種定址模式的計算、保護及資料存取的平行化。指令擷取解碼及資料存取的速度,向來為微處理機效率的瓶頸。x86指令集的格式、語意及定址模式均異常複雜,更加重了這些瓶頸對效能的影響。現今之高效能x86指令集處理機,皆用種種特殊的設計來解決這些瓶頸。我們將評估這些設計對效能的影響,並提出新的設計以提供更高的解碼及資料存取頻寬,以配合多重執行單元的運算能力。本計畫為三年之研究發展計畫中的第二年。在第一年的研究中,我們提出了各單元的運作模型設計,包括指令預先解碼器、指令擷取器、指令解碼器、指令分配器、資料存取單元及資料位址產生單元。並以軟體模擬評估重要實作方案的可行性、複雜度、硬體成本及效能,來確定重要的設計決策。並訂定了各單元RTL層次的介面及設計。本年度的研究重點,在延續第一年的研究成果,將設計落實至邏輯閘層次;於總計畫的整合下驗證各單元RTL層次的設計;並更精確的評估實作上的時脈限制、成本限制,回饋改進原來的設計。第三年的研究方向,在以前兩年的研究為基礎,參與實作驗證的整合,以實作的限制及效果,從事系統參數調整甚至結構修正,並參與技術轉移。
The objective of this subproject is to design an instruction
fetch/decode unit and a data load/store unit compatible with the
x86 instruction set. The instruction fetch/decode unit covers
instruction fetching, decoding, and dispatching. The data load/store
unit covers address computation for the various addressing modes,
protection checking, and parallel data access. Instruction fetch/decode
rates and data access rates have long been performance bottlenecks
for ILP microprocessors, and the complex formats, semantics, and
addressing modes of the x86 instruction set make these bottlenecks
worse. We will design an instruction fetch/decode unit and a data
load/store unit with enough bandwidth to match the execution rate
of multiple execution units. This is the second year of a three-year
project. In the first year, we studied and designed the instruction
fetch/decode unit and the data load/store unit. The blocks we designed
include the instruction predecoder, the instruction fetcher, the
instruction decoder, the instruction dispatcher, the data access
unit, and the data address generator. A preliminary software simulation
was built to evaluate the feasibility, complexity, hardware cost, and
performance of the main implementation options. We also defined the
interfaces with other units and refined our designs to the RTL level.
This year, we will refine our designs to the logic gate level, taking
implementation restrictions such as timing and cost into account. In
the third year, building on the first two years, we will take part in
implementation, verification, and system integration, adjusting system
parameters or revising the structure as implementation constraints
require.
Project period:
August 1996 to July 1999 (ROC calendar 85.8 – 88.7)
Project title:
單晶片多處理機設計之研究
Study of Single Chip Multiprocessor Design
Sponsor:
National Science Council
隨著超大型積體電路技術之進步,單晶片中可包含之硬體資源大幅增加。過去的研究者在單指令引線中發掘指令間之平行性以善用硬體資源獲取效益,而發展出如超純量及超長指令計算機之設計。然而隨著積體電路技術之進步,單引線中有限之指令間平行性已不足以完全發揮半導體技術所提供之大量硬體資源,我們必須思考新的設計方向,以善用硬體資源提昇系統效能。
本計畫嘗試著在半導體技術足以支援之前提下,研究在單晶片上如何建構多個處理單元,並同時執行多個引線之程式。藉著將多個處理單元更緊密的結合在單一晶片上,我們除可在過去之基礎上繼續發掘單引線內指令之平行性外,更可開發單晶片上引線間之平行性;同時我們亦可以此多處理機晶片為基礎延伸而成大量平行性架構。我們將分別以處理單元間之連接、控制機構之設計,處理單元間之工作分配、排程與程式碼之產生,記憶體系統之設計,記憶體效能評估等議題研究多處理機晶片架構設計及以此為核心之相關軟硬體支援,評估其系統整體效能並以FPGA方式實作驗證。
我們規畫此三年計畫以達成上述目標。在第一年中,我們預定完成排程問題理論模型、晶片上快取記憶體標籤陣列與目錄之設計方案、以及以FPGA展示板實現晶片上連結網路。自下年度起,我們將這個計畫規劃為以下三個子計畫,分別進行其重點研究,並整合之以達成整體研究目標:
一、標竿程式分析、測試與效能評估
二、處理機晶片內記憶體結構之設計
三、單晶片多處理機可程式實驗平台之設計及實現
各個子計畫之研究重點如下:子計畫一規畫多處理支援機構,發展工作排程之技術,建立完整單晶片多處理機之編譯器環境。子計畫二將進行晶片內快取記憶體設計:規畫晶片內快取記憶體結構,擴展可延伸式記憶體階層設計觀念,實現多處理支援機構。子計畫三則將以FPGA實現連結網路,建立實驗平台雛形,並撰寫軟硬體介面程式,實現PC至網路實驗平台之連結。我們期望藉此整合型計畫能對單晶片多處理機之各項相關軟、硬體架構設計有一完整深入之探討,希望我們整合的研究成果能成為新一代計算機系統之設計觀念,為學術界與工業界所採用。
Because of the progress in VLSI technology, more and
more hardware resources can be built on a single silicon chip. Researchers
have developed architectures such as superscalar and VLIW to exploit
instruction level parallelism within a single instruction stream to
benefit from this progress. However, the limited parallelism in a single
instruction stream cannot keep up with the growth in available resources
as VLSI technology moves further forward. Computer architects are
challenged to exploit much more parallelism to benefit from the ever
increasing hardware resources on a single chip.
We aim to build more than one processing element
on a single chip and to execute more than one instruction
stream concurrently, provided that circuit density allows.
Coupling several processing elements tightly on a single chip
reduces communication latency and allows mechanisms to be designed
that exploit thread-level parallelism as well as instruction-level
parallelism. Moreover, we will design mechanisms that make the
multiprocessor chip easy to use as a building block for massively
parallel machines. Issues in our approach to multiprocessor chip design
and the related hardware and software support include the interconnection
and control of processing elements, task assignment and scheduling on
processing elements, object code generation, memory system design, and
performance evaluation. Finally, we will evaluate the overall system
performance and implement the design using FPGAs.
To achieve this goal, we proposed a three-year
project to investigate the above issues. This is the second year of
our research. The results of the first year include a theoretical model
for task assignment, a design scheme for the on-chip cache directory and
tag array, and an FPGA demo board that implements the on-chip interconnection
network. From the second year on, we organize this joint project into
three subprojects, each with its own research themes, and integrate
them so that the overall system design and performance picture
can be established. These subprojects are:
1. Benchmarking and Performance Evaluation.
2. Design of Scalable Memory Architecture on CPU Chip.
3. Design and Implementation of a Programmable Platform for a Single
Chip with Multiple CPUs.
The research topics of the three subprojects are outlined
as follows. Subproject 1 will design multiprocessing support mechanisms,
develop task scheduling techniques, and construct the compiler
environment for the on-chip multiprocessor. Subproject 2 will design
the on-chip caches, extend the design concept of a scalable memory
hierarchy, and implement the multiprocessing support mechanisms. Subproject
3 will implement the on-chip interconnection network with FPGAs, construct
the FPGA experiment platform, develop the interface programs that connect
PCs to the interconnection network prototype, and carry out communication
between PCs via the prototype. Through this integrated project, we hope
to propose the software and hardware architectures needed for the
single-chip multiprocessor. Furthermore, we hope the proposed solutions
will become design concepts for next-generation computer systems and be
adopted by academia and industry.
Project period:
August 1996 to July 1999 (ROC calendar 85.8 – 88.7)
Project title:
標竿程式分析.測試與效能評估
Benchmarking and Performance Evaluation
Sponsor:
National Science Council
藉由將多個處理單元建構於單晶片上,我們可降低處理單元間之通訊延遲;並可建構通訊法則,以提昇處理單元使用率。此多處理機晶片可被更進一步的用以建構階層式多處理機系統,不同階層間具有不同之通訊網路,使得此多處理機系統具有更佳之延展性。為了發揮架構特性以提昇效能,我們將針對其階層特性,開發工作排程技術,並開發多處理支援機構,以進一步降低程式執行時間。
我們並將建構編譯與模擬環境以評估我們所發展之技術。我們選擇Stanford
SUIF作為編譯器之基礎,並做加強與改進,與我們自行發展之排程技術相結合,以編譯標竿程式。我們所選用的標竿程式將同時包含一般應用程式與多處理機系統之平行程式。最後,我們將進行對標竿程式之模擬以評估系統效能。
本計畫為期三年。在第一年中,我們已建構一模擬環境,架設基本SUIF模組,並建立一靜態排程模型。在往後二年中,我們將持續進行靜態/動態排程之研究,並設計如同步、資料預先擷取等多處理支援機構。
By embedding more than one processing element in
a single chip, communication latency can be reduced and mechanisms can
be constructed to improve processor utilization. Moreover, the MP-chip
can be used as the basis for hierarchical multiprocessor systems
with different communication latencies in different layers. To benefit
from these features, we will develop task scheduling techniques that
take the hierarchical structure into account and design on-chip
multiprocessing support mechanisms to further reduce program execution time.
A simulation and compilation environment will be built
to evaluate the techniques we develop. The Stanford SUIF system has
been chosen as the basis of our compilation environment. We will
enhance SUIF and integrate it with our task scheduling techniques
to compile the benchmark programs. The set of benchmark
programs contains general-purpose applications as well as parallel programs
for multiprocessor systems. Finally, the benchmark programs will be
simulated to evaluate system performance.
This is the second year of our research. In the first
year, we constructed a simulation environment, ported SUIF, and
developed a theoretical model for static task scheduling. In the second
and third years, we will continue developing static/dynamic scheduling
techniques for hierarchical systems and design multiprocessing support
mechanisms such as data prefetching and synchronization primitives.
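The idea of scheduling with layer-dependent communication latency can be illustrated with a small sketch. This is not the project's scheduler: it is a generic greedy list scheduler, and the function names, the two latency constants, and the sample task graph are all hypothetical values chosen for illustration.

```python
# Hypothetical latencies in cycles; not taken from the project.
ON_CHIP_LATENCY = 1    # two processing elements (PEs) on the same chip
OFF_CHIP_LATENCY = 10  # PEs on different chips


def comm_latency(pe_a, pe_b, pes_per_chip):
    """Communication cost between two PEs in a two-layer hierarchy."""
    if pe_a == pe_b:
        return 0
    same_chip = pe_a // pes_per_chip == pe_b // pes_per_chip
    return ON_CHIP_LATENCY if same_chip else OFF_CHIP_LATENCY


def schedule(tasks, deps, num_pes, pes_per_chip):
    """Greedy static list scheduling.

    tasks: dict task -> execution time, listed in topological order
    deps:  dict task -> list of predecessor tasks
    Returns dict task -> (pe, start, finish).
    """
    pe_free = [0] * num_pes  # time at which each PE becomes idle
    placed = {}
    for task, cost in tasks.items():
        best = None
        for pe in range(num_pes):
            # Data from each predecessor arrives after its finish time
            # plus the latency of the layer the message has to cross.
            ready = max(
                (placed[p][2] + comm_latency(placed[p][0], pe, pes_per_chip)
                 for p in deps.get(task, [])),
                default=0,
            )
            start = max(ready, pe_free[pe])
            # Keep the PE that gives the earliest finish time.
            if best is None or start + cost < best[2]:
                best = (pe, start, start + cost)
        placed[task] = best
        pe_free[best[0]] = best[2]
    return placed


if __name__ == "__main__":
    tasks = {"a": 2, "b": 3, "c": 3, "d": 1}
    deps = {"b": ["a"], "c": ["a"], "d": ["b", "c"]}
    for name, (pe, start, finish) in schedule(tasks, deps, 4, 2).items():
        print(name, "PE", pe, "start", start, "finish", finish)
```

With these assumed latencies, the sample graph stays on the two PEs of a single chip, because sending data to the other chip would cost more than waiting for a local PE.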
Project period: July 1997 to June 1998 (ROC calendar 86.7 – 87.6)
Project title: Java處理器之多媒體技術研究
Study of Multimedia Technologies in a Java Processor
Sponsor: Computer and Communications Research Laboratories, Industrial Technology Research Institute (ITRI)
本Java處理器之多媒體技術研究計畫將承續上年度改良式Java處理器的研究成果,擬以一年時間,針對Java處理器在日益普遍的多媒體應用環境中執行多媒體指令的效能進行改良之研究。主要研究之問題為Stack-based架構的Java處理器與SIMD的多媒體計算核心間各種平行執行與搭配的組合。
在今年度的計畫中,本研究小組將考慮兩個將Java處理器核心與SIMD的多媒體計算核心融合在一個處理器中的不同研究方向。第一個方向是試著把多媒體暫存器與運算元堆疊合併,並提供高頻寬的堆疊/多媒體暫存器,但必須同時兼顧堆疊存取指標與多媒體資料非循序存取之特性,並分析共用堆疊/多媒體暫存器的優缺點;第二個方向就是採用分離式的設計,但必須另外再加上多媒體暫存器與資料快取記憶體,以及多媒體暫存器到運算元堆疊之間的傳輸界面,以分析Java處理器核心與多媒體計算核心之平行執行、同步、以及多重引線等組合運作模式的可行性。最後希望提出一個SIMD多媒體計算核心與多媒體結構化暫存器的架構在Java處理器中的良好設計,並取得工業的標準多媒體測試程式以驗證其執行效能。
In this one-year project, we will build on
the results of our previous research project on the enhanced Java
processor to further improve its performance
in today's increasingly common multimedia applications. The main research
question is how to combine a stack-based Java processor core with an SIMD
multimedia computation core so that the two can execute in parallel.
In this project, our research team will consider two
schemes for combining a Java processor core and a multimedia computation
core in one processor chip. The first merges the multimedia
registers into the operand stack of the Java processor and tries to provide
high-bandwidth multimedia data access while keeping stack pointer
maintenance flexible.
The second adopts separate designs for the operand stack and the multimedia
registers. This scheme requires extra interfaces from the multimedia registers
to data memory and from the multimedia registers to the operand stack.
We will evaluate the advantages and disadvantages of the shared and
separate designs, including the feasibility of parallel execution,
synchronization, and multithreading. Finally, we will propose a
well-designed architecture that integrates the SIMD multimedia computation
core and the corresponding structured multimedia registers into the Java
processor, and evaluate its performance
with industry-standard multimedia benchmarks.
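The first scheme, in which operand-stack entries double as multimedia registers, can be pictured with a toy model. The sketch below is an illustration only, not the project's design: it assumes 32-bit stack words, four unsigned 8-bit lanes per word, and wrap-around lane arithmetic, and the class and method names (`SharedStack`, `iadd`, `padd8`) are hypothetical.

```python
class SharedStack:
    """Toy operand stack whose 32-bit entries can also be read as packed data."""

    def __init__(self):
        self.words = []

    def push(self, value):
        # Entries are kept as unsigned 32-bit words.
        self.words.append(value & 0xFFFFFFFF)

    def pop(self):
        return self.words.pop()

    def iadd(self):
        """Scalar add, stack-machine style: pop two words, push their sum."""
        b, a = self.pop(), self.pop()
        self.push(a + b)

    def padd8(self):
        """Packed add: treat the top two words as four 8-bit lanes each."""
        b, a = self.pop(), self.pop()
        result = 0
        for lane in range(4):
            shift = 8 * lane
            # Masking to 8 bits drops the carry, so lanes stay independent.
            lane_sum = ((a >> shift) + (b >> shift)) & 0xFF
            result |= lane_sum << shift
        self.push(result)


if __name__ == "__main__":
    stack = SharedStack()
    stack.push(0x01FF10F0)
    stack.push(0x01010120)
    stack.padd8()
    print(hex(stack.pop()))
```

In the scalar add a carry ripples across byte boundaries, while the packed add discards it at each lane. A shared stack/register file would have to support both views of the same storage.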
|