<?xml version="1.0" encoding="iso-8859-1" ?>
<rss version="2.0">
<channel>
<title><![CDATA[Recent Microprocessors White Papers, Webcasts and Case Studies - ZDNet]]></title>
<link><![CDATA[http://whitepapers.zdnet.com/Desktops%2C+Laptops+and+OS/Components/Microprocessors/]]></link>
<description><![CDATA[Recent Microprocessors White Papers, Webcasts and Case Studies - ZDNet]]></description>
<language>en-us</language>
<item>
<title><![CDATA[Energy-Aware Microprocessor Synchronization: Transactional Memory Vs. Locks]]></title>
<link><![CDATA[http://whitepapers.zdnet.com/abstract.aspx?docid=396069]]></link>
<description><![CDATA[One important way in which multiprocessors differ from uniprocessors is in the need to give programmers the ability to synchronize concurrent access to memory. Transactional memory was proposed as a way of improving throughput, especially when the rate of synchronization conflict is low. This paper explores the power implications of transactional memory on standard and synthetic benchmarks. The authors propose a new "Serial execution" mode that lowers energy consumption during high contention periods by reducing transaction throughput. They conclude that transactional models are a promising approach to low-power synchronization, and that serial execution strengthens the energy advantage, but that further work is needed to fully understand how transactions compare to locks at high levels of contention.]]></description>
<pubDate>Tue, 25 Nov 2008 04:34:44 -0800</pubDate>
</item>
<item>
<title><![CDATA[Ultra Low-Cost Defect Protection for Microprocessor Pipelines]]></title>
<link><![CDATA[http://whitepapers.zdnet.com/abstract.aspx?docid=396065]]></link>
<description><![CDATA[The sustained push toward ever smaller technology sizes has reached a point where device reliability has moved to the forefront of concerns for next-generation designs. Silicon failure mechanisms, such as transistor wearout and manufacturing defects, are a growing challenge that threatens the yield and product lifetime of future systems. This paper introduces the BulletProof pipeline, the first ultra low-cost mechanism to protect a microprocessor pipeline and on-chip memory system from silicon defects. To achieve this goal, the authors combine area-frugal on-line testing techniques and system-level checkpointing to provide the same guarantees of reliability found in traditional solutions, but at much lower cost.]]></description>
<pubDate>Tue, 25 Nov 2008 04:32:16 -0800</pubDate>
</item>
<item>
<title><![CDATA[Multiple Instruction Stream Processor]]></title>
<link><![CDATA[http://whitepapers.zdnet.com/abstract.aspx?docid=396021]]></link>
<description><![CDATA[Microprocessor design is undergoing a major paradigm shift towards multi-core designs, in anticipation that future performance gains will come from exploiting thread-level parallelism in the software. To support this trend, the paper presents a novel processor architecture called the Multiple Instruction Stream Processing (MISP) architecture. MISP introduces the sequencer as a new category of architectural resource, and defines a canonical set of instructions to support user-level inter-sequencer signaling and asynchronous control transfer. MISP allows an application program to directly manage user-level threads without OS intervention. By supporting the classic cache-coherent shared-memory programming model, MISP does not require a radical shift in the multithreaded programming paradigm.]]></description>
<pubDate>Tue, 25 Nov 2008 04:07:15 -0800</pubDate>
</item>
<item>
<title><![CDATA[Processor Power Management Features and Process Scheduler: Do We Need to Tie Them Together?]]></title>
<link><![CDATA[http://whitepapers.zdnet.com/abstract.aspx?docid=395981]]></link>
<description><![CDATA[Power saving is a key focus area in today's microprocessors, and almost all recent microprocessors provide a wide variety of power-saving features. Processor P-state is the capability of running the processor at different voltage and/or frequency levels. Processor C-state is the processor's capability to go into various low-power idle states (with varying wakeup latency). Linux kernel policies like the cpufreq-ondemand governor and the cpuidle-menu governor make effective use of these processor power management features, giving power savings to the end user. This paper looks at various inter-relations between Linux power management features and the process scheduler. In particular, it covers various issues and mechanisms for incorporating power management related information into the process scheduler.]]></description>
<pubDate>Tue, 25 Nov 2008 03:44:52 -0800</pubDate>
</item>
<item>
<title><![CDATA[A Power-Aware Run-Time System for High-Performance Computing]]></title>
<link><![CDATA[http://whitepapers.zdnet.com/abstract.aspx?docid=392670]]></link>
<description><![CDATA[The High-Performance Computing (HPC) community has focused on performance, where performance is defined as speed. To achieve better performance per compute node, microprocessor vendors have not only doubled the number of transistors (and speed) every 18-24 months, but they have also doubled the power densities. Consequently, keeping a large-scale HPC system functioning properly requires continual cooling in a large machine room, thus resulting in substantial operational costs. Furthermore, the increase in power densities has led to a decrease in system reliability, thus leading to lost productivity. To address these problems, this paper proposes a power-aware algorithm that automatically and transparently adapts its voltage and frequency settings to achieve significant power reduction and energy savings with minimal impact on performance.]]></description>
<pubDate>Sun, 02 Nov 2008 22:44:23 -0800</pubDate>
</item>
<item>
<title><![CDATA[Design and Implementation of a High-Performance Microprocessor Cache Compression Algorithm]]></title>
<link><![CDATA[http://whitepapers.zdnet.com/abstract.aspx?docid=391182]]></link>
<description><![CDATA[Researchers have proposed using hardware data compression units within the memory hierarchies of microprocessors in order to improve performance, energy efficiency, and functionality. However, most past work, and in particular work on cache compression, has made unsubstantiated assumptions about the performance, power consumption, and area overheads of the required compression hardware. This paper presents a lossless compression algorithm that has been designed for on-line memory hierarchy compression, and cache compression in particular. The authors reduced the algorithm to a register transfer level hardware implementation, permitting performance, power consumption, and area estimation.]]></description>
<pubDate>Tue, 21 Oct 2008 10:15:57 -0700</pubDate>
</item>
<item>
<title><![CDATA[Warp Processors]]></title>
<link><![CDATA[http://whitepapers.zdnet.com/abstract.aspx?docid=391056]]></link>
<description><![CDATA[This paper describes a new processing architecture, known as a warp processor, that uses a Field-Programmable Gate Array (FPGA) to improve the speed and energy consumption of a software binary executing on a microprocessor. Unlike previous approaches that also improve software using an FPGA but do so using a special compiler, a warp processor achieves these improvements completely transparently and operates from a standard binary. A warp processor dynamically detects the binary's critical regions, re-implements those regions as a custom hardware circuit in the FPGA, and replaces the software region with a call to the new hardware implementation of that region.]]></description>
<pubDate>Tue, 21 Oct 2008 08:10:53 -0700</pubDate>
</item>
<item>
<title><![CDATA[Design of a Data Recovery Block for Communications Over Power Distribution Networks of Microprocessors]]></title>
<link><![CDATA[http://whitepapers.zdnet.com/abstract.aspx?docid=388594]]></link>
<description><![CDATA[This paper proposed the use of the Power Distribution Network (PDN) of a microprocessor for ubiquitous access to internal nodes for test/debug and showed the suitability of impulse Ultra-WideBand (UWB) communications for the purpose. This paper presents the design of a data recovery block to recover data from UWB impulses superposed on a power line of a microprocessor. Considerations for data recovery block design based upon measured PDN characteristics are discussed. The proposed circuit was implemented in a TSMC 0.18 um CMOS process, and simulations show that it consumes 4.42 mW when operating from a 1.8 V supply at a pulse repetition rate of 200 MHz.]]></description>
<pubDate>Mon, 06 Oct 2008 02:06:10 -0700</pubDate>
</item>
<item>
<title><![CDATA[High-Bandwidth Address Translation for Multiple-Issue Processors]]></title>
<link><![CDATA[http://whitepapers.zdnet.com/abstract.aspx?docid=375584]]></link>
<description><![CDATA[In an effort to push the envelope of system performance, microprocessor designs are continually exploiting higher levels of instruction-level parallelism, resulting in increasing bandwidth demands on the address translation mechanism. Most current microprocessor designs meet this demand with a multi-ported TLB. While this design provides an excellent hit rate at each port, its access latency and area grow very quickly as the number of ports is increased. As bandwidth demands continue to increase, multi-ported designs will soon impact memory access latency. This paper presents four high-bandwidth address translation mechanisms with latency and area characteristics that scale better than a multi-ported TLB design.]]></description>
<pubDate>Thu, 10 Jul 2008 07:13:08 -0700</pubDate>
</item>
<item>
<title><![CDATA[Scalability of 3D-Integrated Arithmetic Units in High-Performance Microprocessors]]></title>
<link><![CDATA[http://whitepapers.zdnet.com/abstract.aspx?docid=375418]]></link>
<description><![CDATA[Three-dimensional (3D) integration provides a simultaneous improvement in wire-related delay and power consumption of microprocessor circuits. Prior work has looked at the performance, power, and area benefits of 3D integration technology. This paper investigates the scalability issues of 3D die-stacked arithmetic units. It explores the behavior of 3D-integrated arithmetic circuits with increasing issue-width (parallel execution capability), transistor sizing, and temperature. The paper shows that the 3D-integrated units have lower latency degradation and a lower rate of increase in energy consumption than the planar circuits with increasing issue-widths and operating temperatures. It demonstrates that the 3D-integrated circuits are less sensitive to transistor sizing than the planar circuits. The better scalability of 3D circuits may extend the silicon roadmap for a few more generations.]]></description>
<pubDate>Thu, 10 Jul 2008 05:47:31 -0700</pubDate>
</item>
<item>
<title><![CDATA[Set processor affinity programmatically in a multi-core system]]></title>
<link><![CDATA[http://whitepapers.zdnet.com/abstract.aspx?docid=362540]]></link>
<description><![CDATA[Multi-core personal computers are now commonplace, and application developers should be aware of them and consider the impact that more than one CPU can have on the performance of their applications. There are a few different ways to set processor affinity, and Edmond Woychowsky shows you two programmatic ways he has used in his projects.

This download is also available as an entry in the TechRepublic Programming and Development blog.]]></description>
<pubDate>Thu, 19 Jun 2008 12:27:37 -0700</pubDate>
</item>
<item>
<title><![CDATA[Keeping City Finances on Track With Sun]]></title>
<link><![CDATA[http://whitepapers.zdnet.com/abstract.aspx?docid=357554]]></link>
<description><![CDATA[Settled by Ohlone Indians in 1200 B.C. and incorporated as a city in 1852, Oakland has garnered praise from the likes of Forbes magazine, which ranked it as the tenth best city for business in the U.S. in 2001. Running Oracle Financials on a DEC Alpha 8400 server, the City experienced slow processing times, inaccurate payroll data and diminishing productivity because it couldn't run other financial applications concurrently with the payroll program. Later, the City of Oakland made the decision to migrate its applications and data to a more robust, scalable platform. The City worked closely with Sun Services to migrate its financial applications and data to a Sun Enterprise 10000 server powered by UltraSPARC microprocessors and running the Solaris 8 Operating Environment.]]></description>
<pubDate>Tue, 20 May 2008 08:27:20 -0700</pubDate>
</item>
<item>
<title><![CDATA[SAS 9.1.3 SP4 on IBM System p 570 POWER6 Processor-Based Server]]></title>
<link><![CDATA[http://whitepapers.zdnet.com/abstract.aspx?docid=350939]]></link>
<description><![CDATA[The IBM System p 570 is powered by 64-bit, 4.7 GHz IBM POWER6 processors. The POWER6 processor is the next generation of the IBM POWER5 family of microprocessors. The IBM System p 570 server with the IBM AIX Version 5.3 operating system demonstrated impressive performance, scalability and reliability when running the SAS 9 (classic SAS) application in tests performed by IBM and SAS.]]></description>
<pubDate>Tue, 08 Apr 2008 10:48:42 -0700</pubDate>
</item>
<item>
<title><![CDATA[An Assessment of Leadership Performance With POWER6 Processors and Red Hat Enterprise Linux 5.1]]></title>
<link><![CDATA[http://whitepapers.zdnet.com/abstract.aspx?docid=350884]]></link>
<description><![CDATA[Compute-intensive performance is increasingly required for today's high-performance environments. The IBM System p 570 with POWER6 processors provides leadership performance and demonstrates excellent scalability moving from one node to four nodes in this environment while providing linear SMP (Symmetric Multiprocessing) scaling and growth for workloads similar to the metrics used here. This paper highlights the exceptional performance on IBM's POWER6 processor-based systems running with the latest Red Hat Enterprise Linux (RHEL) 5.1 operating system, based on recently audited published SPEC CPU2006 and SPECjbb2005 results, including single-system LINPACK metrics on System p 570 4-core, 8-core and 16-core systems. With Simultaneous Multithreading (SMT) support turned on, Linux is easily able to provide effective scheduling support of the 32 processor threads seen on the 16-core 570 system.]]></description>
<pubDate>Tue, 08 Apr 2008 09:50:55 -0700</pubDate>
</item>
<item>
<title><![CDATA[IBM POWER6 Microprocessor Physical Design and Design Methodology]]></title>
<link><![CDATA[http://whitepapers.zdnet.com/abstract.aspx?docid=350838]]></link>
<description><![CDATA[The IBM POWER6 microprocessor is a 790 million-transistor chip that runs at a clock frequency of greater than 4 GHz. The complexity and size of the POWER6 microprocessor, together with its high operating frequency, present a number of significant challenges. This paper describes the physical design and design methodology of the POWER6 processor. Emphasis is placed on aspects of the design methodology, technology, clock distribution, integration, chip analysis, power and performance, Random Logic Macro (RLM), and design data management processes that enabled the design to be completed and the project goals to be met.]]></description>
<pubDate>Tue, 08 Apr 2008 09:02:53 -0700</pubDate>
</item>
</channel>
</rss>
