Parallel Programming Methodology, Intel Software College.


Presentation topic: "Parallel Programming Methodology, Intel Software College." Presentation transcript:

1 Parallel Programming Methodology, Intel Software College

2 Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners. Threaded Programming Methodology. Objectives: by the end of this module, you will be able to build a prototype and estimate the effort required to parallelize time-consuming regions.

3 Agenda. A generic development cycle. Case study: prime number generation. Some common performance problems.

4 What is parallelism? Two or more processes or threads execute at the same time. Parallelism on multi-core architectures takes two forms: multiple processes, which communicate through IPC (Inter-Process Communication), or a single process with multiple threads, which communicate through shared memory.

5 Amdahl's Law. Describes the upper bound on speedup from parallel execution; the serial code limits the speedup. With n = number of processors and P = the fraction of the serial run time that can be parallelized: $T_{\text{parallel}} = \{(1-P) + P/n\}\,T_{\text{serial}}$ and $\text{Speedup} = T_{\text{serial}} / T_{\text{parallel}}$. Example with P = 0.5: for n = 2, the speedup is 1.0/0.75 = 1.33; for n = ∞, the P/n term vanishes and the speedup is 1.0/0.5 = 2.0.
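As a minimal sketch of the formula on this slide, the two helper functions below (hypothetical names, not part of the course material) evaluate the speedup for a parallel fraction P on n processors and the limit as n grows without bound. With P = 0.5 they reproduce the slide's values of about 1.33 for n = 2 and 2.0 in the limit.

```cpp
#include <cassert>

// Amdahl's Law: speedup for parallel fraction P on n processors.
// T_parallel / T_serial = (1 - P) + P / n, so speedup is the reciprocal.
double amdahlSpeedup(double P, int n) {
    assert(P >= 0.0 && P <= 1.0 && n >= 1);
    return 1.0 / ((1.0 - P) + P / n);
}

// Limit for an unbounded number of processors: only the serial part remains.
double amdahlLimit(double P) {
    assert(P >= 0.0 && P < 1.0);
    return 1.0 / (1.0 - P);
}
```

For example, `amdahlSpeedup(0.5, 2)` evaluates 1.0/0.75 and `amdahlLimit(0.5)` evaluates 1.0/0.5.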

6 Processes and Threads. Modern operating systems load programs as processes, which own the resources and the execution. A process starts by executing its entry point as a thread. Threads can create other threads within the process. Each thread gets its own stack. All threads within a process share the code and data segments. [Figure: a process with a code segment, a data segment, and several threads (main() and others), each with its own stack.]

7 Threads: Benefits and Risks. Benefits: higher performance and better resource utilization, even on single-processor systems, where threads hide latency and improve responsiveness; inter-process communication through shared memory is more efficient. Risks: increased application complexity; difficult to debug (race conditions, deadlocks, etc.).

8 Common questions when parallelizing applications. Where to parallelize? How long does it take to parallelize? How much redesign effort is required? Is it worth parallelizing a specific region? How much speedup is expected? Does the performance meet my expectations? Will it scale as more threads or data are added? Which parallelization model should be used?

9 Prime Number Generation.

    bool TestForPrime(int val)
    {
        // let's start checking from 3
        int limit, factor = 3;
        limit = (long)(sqrtf((float)val)+0.5f);
        while( (factor <= limit) && (val % factor) )
            factor++;
        return (factor > limit);
    }

    void FindPrimes(int start, int end)
    {
        int range = end - start + 1;
        for( int i = start; i <= end; i += 2 ) {
            if( TestForPrime(i) )
                globalPrimes[gPrimesFound++] = i;
            ShowProgress(i, range);
        }
    }

10 Activity 1. Run the serial version of the prime number code: locate the PrimeSingle directory, compile with the Intel compiler, and run it a few times with different ranges.

11 Development Methodology. Analysis: find the code where compute-intensive work is done. Design (introduce threads): determine how to implement a parallelized solution. Debug: detect any problems that result from using threads. Tune for performance: achieve the best parallel performance.

12 Development Cycle. Analysis: VTune Performance Analyzer. Design (introduce threads): Intel® Performance libraries (IPP and MKL), OpenMP* (Intel® Compiler), explicit threading (Win32*, Pthreads*). Debug: Intel® Thread Checker, Intel Debugger. Tune for performance: Intel® Thread Profiler, VTune Performance Analyzer.

13 Analysis: Sampling. We will use the PrimeSingle project for the analysis. Usage: ./PrimeSingle. Use VTune Sampling to find hotspots in the application; sampling identifies the time-consuming regions.

    bool TestForPrime(int val)
    {
        // let's start checking from 3
        int limit, factor = 3;
        limit = (long)(sqrtf((float)val)+0.5f);
        while( (factor <= limit) && (val % factor) )
            factor++;
        return (factor > limit);
    }

    void FindPrimes(int start, int end)
    {
        // start is always odd
        int range = end - start + 1;
        for( int i = start; i <= end; i += 2 ){
            if( TestForPrime(i) )
                globalPrimes[gPrimesFound++] = i;
            ShowProgress(i, range);
        }
    }

14 Analysis: Call Graph. Used to find the appropriate level in the call tree at which to parallelize. Callout on the figure: "This is the level in the call tree where we need to parallelize."

15 Analysis. Where to parallelize? FindPrimes(). Is it worth parallelizing the selected region? It appears to have minimal dependencies, it appears to be data-parallel, and it consumes over 95% of the execution time. [Figure: baseline measurement.]

16 Activity 2. Run the code with the range of … to obtain the baseline measurement, and note it for future reference. Run the VTune analysis tool on the serial code. Which function takes the most time?

17 Foster's Design Methodology. From Designing and Building Parallel Programs by Ian Foster. Four steps: Partitioning (divide computation and data), Communication (exchange data between computations), Agglomeration (group tasks to improve performance), and Mapping (assign tasks to processors/threads).

18 Designing Parallel Programs. Partition: divide the problem into tasks. Communicate: determine the amount and pattern of communication. Agglomerate: combine tasks. Map: assign the agglomerated tasks to the created threads. [Figure: Problem, Initial tasks, Communication, Combined tasks, Final program.]

19 Parallel Programming Models. Functional decomposition (task parallelism): divide the computation, then associate the data; independent tasks of the same problem. Data decomposition: the same operation executed on different data; divide the data into pieces, then associate the computation.

20 Decomposition Methods. Functional decomposition: focusing on the computation can reveal structure in a problem. [Figure: atmosphere model, ocean model, land model, hydrology model.] Domain decomposition: focus on the largest or most frequently accessed data structure. Data parallelism: the same operation applied to all the data. Grid reprinted with permission of Dr. Phu V. Luong, Coastal and Hydraulics Laboratory, ERDC.

21 Pipeline Decomposition. The computation is done in independent stages. Functional decomposition: threads are assigned to a stage to compute, like an automobile assembly line. Data decomposition: a thread processes all the stages of a single instance, like one worker building a complete car.

22 LAME Encoder Strategy. LAME MP3 encoder: an open source project and an educational tool. The goal of the project is to improve the quality and the speed of MP3 encoding.

23 LAME Pipeline Strategy. Each frame passes through four stages: Prelude, Acoustics, Encoding, and Other. The steps performed on a frame, in order: fetch the next frame, characterize the frame, set the encoder parameters, FFT long/short analysis, assemble the filter, apply filters, suppress noise, quantize and count bits, add the frame header, verify correctness, write to disk. [Figure: timeline in which threads T1 to T4 overlap the stages of frames N through N+3, synchronized by a hierarchical barrier.]

24 Design. What is the expected benefit? How do we achieve it with the least effort? How long does it take to parallelize? How much redesign effort is required? Rapid prototyping with OpenMP. Speedup(2P) = 100/(96/2+4) = ~1.92X.
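The estimate on this slide can be read as Amdahl's Law with two threads, taking the 96 and the 4 as the parallel and serial percentages of the serial run time. That reading is an interpretation; the slide gives only the arithmetic.

```latex
\[
\text{Speedup}(n=2)
  = \frac{T_{\text{serial}}}{T_{\text{parallel}}}
  = \frac{100}{\tfrac{96}{2} + 4}
  = \frac{100}{52} \approx 1.92,
\qquad
\left.\frac{1}{(1-P) + P/n}\right|_{P=0.96,\; n=2}
  = \frac{1}{0.04 + 0.48} \approx 1.92 .
\]
```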

25 OpenMP. Fork-join parallelism: the master thread spawns a team of threads as needed. Parallelism is added incrementally, so a sequential program evolves into a parallel program. [Figure: master thread and parallel regions.]

26 Design. The parallel directive tells OpenMP to create threads here for this parallel region; the for directive divides the iterations of the for loop among them.

    #pragma omp parallel for
    for( int i = start; i <= end; i += 2 ){
        if( TestForPrime(i) )
            globalPrimes[gPrimesFound++] = i;
        ShowProgress(i, range);
    }

27 Activity 3. Run the OpenMP version of the code: locate the PrimeOpenMP directory and solution, compile the code, and run with … for comparison. What is the speedup?

28 Design. What is the expected benefit? How do you achieve it with the least effort? How long did it take to parallelize? How much redesign effort is required? Is this the best speedup possible? Speedup of 1.40X (less than 1.92X).

29 Debugging. Is this a correct parallel implementation? No! The results differ on every run…
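One way to see why the results can change from run to run: an increment such as `gPrimesFound++` is a load, an add, and a store, and two threads can interleave those steps. The sketch below is a hypothetical illustration, not code from the course. It replays one such interleaving by hand in a single thread, so the outcome is reproducible; a real race would produce it only occasionally.

```cpp
#include <cassert>

// Replays one unlucky interleaving of `counter++` issued by threads A and B.
// The steps of both increments are written out by hand in a single thread.
int lostUpdateReplay() {
    int counter = 0;
    int regA = counter;   // A loads 0
    int regB = counter;   // B loads 0 before A has stored
    counter = regA + 1;   // A stores 1
    counter = regB + 1;   // B stores 1 and overwrites A's update
    return counter;
}

// The same two increments when each load-add-store runs without interruption.
int serializedReplay() {
    int counter = 0;
    counter = counter + 1;  // A completes its whole increment
    counter = counter + 1;  // then B completes its own
    return counter;
}

// Usage: both expectations hold on every run because nothing here is concurrent.
void checkReplays() {
    assert(lostUpdateReplay() == 1);   // one increment was lost
    assert(serializedReplay() == 2);   // both increments survived
}
```

In the prime finder, a lost update of this kind would mean two primes written to the same array slot and a final count that is too low.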

30 Debugging. Intel® Thread Checker pinpoints notorious threading errors such as race conditions, stalls, and deadlocks. [Figure: inside VTune Performance Analyzer, Intel® Thread Checker applies binary instrumentation to Primes.exe and its DLLs; a runtime data collector writes the result file threadchecker.thr.]

31 Thread Checker

32 Activity 4. Use Thread Checker to analyze the parallelized application: create a Thread Checker activity and run the application. Are any errors reported?

33 Debugging. How much redesign effort is required? How long will it take to parallelize? Thread Checker reported only 2 dependencies, so the effort required should be low.

34 Debugging. In FindPrimes, the pragma creates a critical section for the one shared reference:

    #pragma omp parallel for
    for( int i = start; i <= end; i += 2 ){
        if( TestForPrime(i) )
    #pragma omp critical
            globalPrimes[gPrimesFound++] = i;
        ShowProgress(i, range);
    }

In ShowProgress, the pragma creates a critical section for both references:

    #pragma omp critical
    {
        gProgress++;
        percentDone = (int)((float)gProgress/(float)range*200.0f+0.5f);
    }

35 Activity 5. Modify and run the OpenMP version of the code: add critical region pragmas to the code, compile it, and run it inside Thread Checker. If errors remain, make the appropriate corrections and run it again in Thread Checker. Then compile and run outside Thread Checker, using … for comparison. What is the speedup?

36 Debugging. The answer is correct, but performance dropped to ~1.33X. Is this the best we can expect from this algorithm? No! According to Amdahl's Law, we can expect a speedup close to 1.9X.

37 Common Performance Problems. Parallel overhead: caused by thread creation, scheduling, and so on. Synchronization: excessive global data and contention for the same synchronization objects. Load imbalance: improper distribution of the parallel work. Granularity: not enough parallel work.

38 Tuning for Performance. Thread Profiler pinpoints bottlenecks in parallel applications. [Figure: two instrumentation paths inside VTune Performance Analyzer. Source instrumentation: the compiler builds Primes.c into an instrumented Primes.exe with /Qopenmp_profile. Binary instrumentation: Thread Profiler instruments Primes.exe and its DLLs. A runtime data collector writes the results file Bistro.tp/guide.gvs.]

39 Thread Profiler for OpenMP

40 Thread Profiler for OpenMP. Speedup graph: estimates the speedup achieved by threading and the potential speedup, based on Amdahl's Law, and gives lower and upper bounds.

41 Thread Profiler for OpenMP. [Figure: serial and parallel regions.]

42 Thread Profiler for OpenMP. [Figure: Thread 0, Thread 1, Thread 2, Thread 3.]

43 Thread Profiler (for Explicit Threads)

44 Thread Profiler (for Explicit Threads). Why so many transitions?

45 Performance. This implementation has implicit synchronization calls. The resulting context switches limit performance scaling. Back to the design stage.

46 Activity 6. Use Thread Profiler to analyze the parallelized application: compile and link with /Qopenmp_profile, create a Thread Profiler activity (for explicit threads), and run the application in Thread Profiler. Find the source line that causes the threads to be inactive.

47 Performance. Is this much contention expected? The algorithm performs many more updates than the 10 needed to show progress.

    void ShowProgress( int val, int range )
    {
        int percentDone;
        gProgress++;
        percentDone = (int)((float)gProgress/(float)range*200.0f+0.5f);
        if( percentDone % 10 == 0 )
            printf("\b\b\b\b%3d%%", percentDone);
    }

This change should fix the contention problem:

    void ShowProgress( int val, int range )
    {
        int percentDone;
        static int lastPercentDone = 0;
    #pragma omp critical
        {
            gProgress++;
            percentDone = (int)((float)gProgress/(float)range*200.0f+0.5f);
        }
        if( percentDone % 10 == 0 && lastPercentDone < percentDone / 10 ){
            printf("\b\b\b\b%3d%%", percentDone);
            lastPercentDone++;
        }
    }
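To get a feel for how many more updates the first condition triggers, the serial sketch below counts how often each version of the test would reach printf over a whole run. The function names and the range of 2000 used in the usage note are assumptions for illustration, and the model assumes an even range, so that the loop with step 2 runs range/2 iterations.

```cpp
#include <cassert>

// Same rounding expression as ShowProgress in the slide.
static int percentFor(int progress, int range) {
    return (int)((float)progress / (float)range * 200.0f + 0.5f);
}

// Number of printf calls made by the original condition over a full run.
int printsOriginal(int range) {
    assert(range >= 2);
    int prints = 0;
    for (int progress = 1; progress <= range / 2; ++progress)
        if (percentFor(progress, range) % 10 == 0)
            ++prints;
    return prints;
}

// Number of printf calls once the lastPercentDone guard is added.
int printsGuarded(int range) {
    assert(range >= 2);
    int prints = 0;
    int lastPercentDone = 0;
    for (int progress = 1; progress <= range / 2; ++progress) {
        int percentDone = percentFor(progress, range);
        if (percentDone % 10 == 0 && lastPercentDone < percentDone / 10) {
            ++prints;
            ++lastPercentDone;
        }
    }
    return prints;
}
```

With a range of 2000, `printsGuarded(2000)` returns 10, one print per ten-percent step, while `printsOriginal(2000)` prints on every iteration whose rounded percentage is a multiple of ten, which is roughly one iteration in ten. The gap grows with the range, since the guarded count stays at 10.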

48 Design. Goal: eliminate the implicit contention due to synchronization. The speedup is 2.32X! Is that correct?

49 Performance. Our baseline measurement included the flawed progress update algorithm. Is this the best we can expect from this algorithm? The actual speedup is 1.40X (<<1.9X)!

50 Activity 7. Modify the ShowProgress function (serial and OpenMP versions) so that it prints only the output that is needed. Recompile and run the code, making sure that no instrumentation flags are used. What is the speedup relative to the serial version?

    if( percentDone % 10 == 0 && lastPercentDone < percentDone / 10 ){
        printf("\b\b\b\b%3d%%", percentDone);
        lastPercentDone++;
    }

51 Performance Review. Still 62% of the execution time is spent in locks and synchronization.

52 Performance Review. Let's look at the OpenMP locks. The lock is inside a loop:

    void FindPrimes(int start, int end)
    {
        // start is always odd
        int range = end - start + 1;
    #pragma omp parallel for
        for( int i = start; i <= end; i += 2 ) {
            if( TestForPrime(i) )
    #pragma omp critical
                globalPrimes[gPrimesFound++] = i;
            ShowProgress(i, range);
        }
    }

The critical section replaced with InterlockedIncrement:

    void FindPrimes(int start, int end)
    {
        // start is always odd
        int range = end - start + 1;
    #pragma omp parallel for
        for( int i = start; i <= end; i += 2 ) {
            if( TestForPrime(i) )
                globalPrimes[InterlockedIncrement(&gPrimesFound)] = i;
            ShowProgress(i, range);
        }
    }
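InterlockedIncrement is a Win32 call, so the second version ties the code to Windows. As a hedged sketch of a portable alternative (not what the slides use), OpenMP's `atomic capture` construct can claim an array slot without a critical section. The names `isOddPrime` and `collectPrimes` are hypothetical, and the primality test is a simplified stand-in for TestForPrime. Without OpenMP enabled, the pragmas are ignored and the code runs serially with the same result.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: trial division by odd factors; val is assumed odd and >= 3.
static bool isOddPrime(int val) {
    for (int f = 3; f * f <= val; f += 2)
        if (val % f == 0)
            return false;
    return true;
}

// Collects the odd primes in [start, end]; start is assumed odd and >= 3.
std::vector<int> collectPrimes(int start, int end) {
    assert(start >= 3 && start % 2 == 1 && end >= start);
    std::vector<int> out((end - start) / 2 + 1);  // one slot per odd candidate
    int found = 0;                                // shared slot counter
    #pragma omp parallel for
    for (int i = start; i <= end; i += 2) {
        if (isOddPrime(i)) {
            int slot;
            // Atomically read the old counter value and increment it.
            #pragma omp atomic capture
            slot = found++;
            out[slot] = i;
        }
    }
    out.resize(found);
    std::sort(out.begin(), out.end());  // threads may fill slots in any order
    return out;
}
```

Note that `slot = found++` captures the value before the increment, matching the original `gPrimesFound++`, whereas InterlockedIncrement returns the value after the increment, so the slide's version starts writing at index 1.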

53 Performance Review. Let's look at the second lock. This lock is also being called inside a loop:

    void ShowProgress( int val, int range )
    {
        int percentDone;
        static int lastPercentDone = 0;
    #pragma omp critical
        {
            gProgress++;
            percentDone = (int)((float)gProgress/(float)range*200.0f+0.5f);
        }
        if( percentDone % 10 == 0 && lastPercentDone < percentDone / 10 ){
            printf("\b\b\b\b%3d%%", percentDone);
            lastPercentDone++;
        }
    }

The critical section replaced with InterlockedIncrement:

    void ShowProgress( int val, int range )
    {
        long percentDone, localProgress;
        static int lastPercentDone = 0;
        localProgress = InterlockedIncrement(&gProgress);
        percentDone = (int)((float)localProgress/(float)range*200.0f+0.5f);
        if( percentDone % 10 == 0 && lastPercentDone < percentDone / 10 ){
            printf("\b\b\b\b%3d%%", percentDone);
            lastPercentDone++;
        }
    }

54 Activity 8. Modify the OpenMP critical regions, replacing them with InterlockedIncrement. Recompile and run the code. What is the speedup relative to the serial version?

55 Thread Profiler for OpenMP. [Figure: factors to test per thread, for four threads.]

56 Fixing the Load Imbalance. Distribute the work more evenly. The speedup achieved is 1.68X.

    void FindPrimes(int start, int end)
    {
        // start is always odd
        int range = end - start + 1;
    #pragma omp parallel for schedule(static, 8)
        for( int i = start; i <= end; i += 2 ) {
            if( TestForPrime(i) )
                globalPrimes[InterlockedIncrement(&gPrimesFound)] = i;
            ShowProgress(i, range);
        }
    }
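The effect of schedule(static, 8) can be sketched with a small model. Under a static schedule with a chunk size, OpenMP deals chunks of consecutive iterations to the threads round-robin in thread-number order. The cost model below is an assumption for illustration: testing candidate v costs about sqrt(v) trial divisions, and the unchunked case is modeled as one contiguous block per thread, which is typical but implementation-defined. The function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Thread that runs 0-based iteration `iter` under schedule(static, chunk):
// chunks are dealt to threads round-robin in thread-number order.
int staticOwner(int iter, int chunk, int numThreads) {
    assert(iter >= 0 && chunk > 0 && numThreads > 0);
    return (iter / chunk) % numThreads;
}

// Ratio of the heaviest to the lightest thread load under the assumed
// sqrt(v) cost model. chunk == 0 models one contiguous block per thread;
// chunk > 0 models schedule(static, chunk). start is assumed odd.
double loadImbalance(int start, int end, int numThreads, int chunk) {
    assert(start >= 3 && end >= start && numThreads > 0 && chunk >= 0);
    const int iters = (end - start) / 2 + 1;  // loop advances by 2
    assert(iters >= numThreads * std::max(chunk, 1));  // every thread gets work
    std::vector<double> load(numThreads, 0.0);
    for (int k = 0; k < iters; ++k) {
        const int owner = chunk > 0
            ? staticOwner(k, chunk, numThreads)
            : static_cast<int>(static_cast<long long>(k) * numThreads / iters);
        load[owner] += std::sqrt(static_cast<double>(start + 2 * k));
    }
    const auto mm = std::minmax_element(load.begin(), load.end());
    return *mm.second / *mm.first;
}
```

For a hypothetical range of 3 to 1000001 on four threads, `loadImbalance(3, 1000001, 4, 0)` is well above 1 because the last block holds the largest candidates, while `loadImbalance(3, 1000001, 4, 8)` is very close to 1 because every thread receives small and large candidates alike.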

57 Activity 9
Modify the code to improve load balancing.
Add the schedule(static, 8) clause to the OpenMP parallel for pragma.
Recompile and run the code.
What is the speedup relative to the serial code?

58 Final Thread Profiler Run
Speedup achieved: 1.80X
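The measured speedup can be related back to Amdahl's law from the start of the module, T parallel = { (1-P) + P/n } T serial. The sketch below inverts that formula to estimate the effective parallel fraction P from an observed speedup. The processor count n = 2 used in the note that follows is an assumption, since this slide does not state the core count. The function names are hypothetical.

```c
/* Amdahl's law: speedup for parallel fraction p on n processors. */
double amdahl_speedup(double p, int n)
{
    return 1.0 / ((1.0 - p) + p / n);
}

/* Inverse of the formula above: the parallel fraction implied by an
   observed speedup s on n processors (n must be greater than 1). */
double implied_parallel_fraction(double s, int n)
{
    return (1.0 - 1.0 / s) / (1.0 - 1.0 / n);
}
```

Under the assumed n = 2, a speedup of 1.80X corresponds to an effective P of about 0.89. This is only an estimate, because threading overhead and residual imbalance are folded into the serial part.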

59 Comparative Analysis
Parallel applications require several passes through the software development cycle.

60 Parallel Programming Methodology: What Was Covered
Four steps of the development cycle for writing parallel applications from serial code, and the Intel® tools that support each step:
  Analysis
  Design (introduce threads)
  Debug for correctness
  Tune for performance
Parallel applications require multiple iterations of design, debugging, and performance tuning.
Use the tools to improve productivity.

62 Additional Slides

63 Parallel Overhead
Thread creation overhead
  Overhead increases as the number of active threads increases
Solution
  Use reusable threads and thread pools
  Amortizes the cost of creating threads
  Keeps the number of active threads relatively constant
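In OpenMP terms, one way to reuse threads is to open a single parallel region around a repeated loop and share the work inside it, so the team is formed once rather than once per step. This is a minimal sketch with a hypothetical function name. Many OpenMP runtimes already keep worker threads alive between regions, so the actual gain depends on the implementation.

```c
/* Doubles every element of a[0..n-1], `steps` times over.
   The thread team is created once; each pass only shares out the work. */
void scale_repeatedly(double *a, int n, int steps)
{
    #pragma omp parallel
    {
        /* Every thread runs this outer loop with its own private s. */
        for (int s = 0; s < steps; s++) {
            #pragma omp for
            for (int i = 0; i < n; i++)
                a[i] *= 2.0;
            /* The implicit barrier at the end of "omp for" keeps
               the passes in order. */
        }
    }
}
```

Without OpenMP enabled the pragmas are ignored and the function produces the same result serially.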

64 Synchronization
Contention from dynamic memory allocation
  Dynamic memory allocation causes implicit synchronization
  Allocate on the stack to get thread-local storage
Atomic updates versus critical sections
  Some updates to global data can use atomic operations (the Interlocked family)
  Use atomic updates whenever possible
Critical sections versus mutexes
  CRITICAL_SECTION objects reside in user space
  Use CRITICAL_SECTION objects when visibility beyond process boundaries is not required
  They introduce less overhead
  They have a spin-wait variant that is useful for some applications
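The first two points can be combined: keep per-thread state on the stack and touch the shared variable only once per thread, using an atomic update. The sketch below is a hypothetical example, not course code, and uses OpenMP's atomic construct in place of the Windows Interlocked calls.

```c
/* Counts how many of v[0..n-1] are divisible by k (k must be nonzero). */
long count_multiples(const int *v, int n, int k)
{
    long total = 0;                 /* shared result */
    #pragma omp parallel
    {
        long local = 0;             /* lives on each thread's own stack */
        #pragma omp for nowait
        for (int i = 0; i < n; i++) {
            if (v[i] % k == 0)
                local++;            /* no synchronization needed here */
        }
        /* One atomic update per thread instead of one per element. */
        #pragma omp atomic
        total += local;
    }
    /* The barrier at the end of the parallel region guarantees that
       every thread's contribution has been added before returning. */
    return total;
}
```

OpenMP's reduction clause expresses the same idea more compactly. The explicit form is shown here only to make the stack-local counter and the single atomic update visible.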

65 Unbalanced Work
Unequal workloads lead to idle threads and wasted time.
[Figure: per-thread timeline; axis: Time; legend: Busy, Idle]

66 Granularity
[Figure: coarse-grained versus fine-grained decomposition, each shown as a serial portion plus a parallelizable portion; scaling labels shown: ~3X, ~1.10X, ~2.5X, ~1.05X]

