First Midterm -> Topic 1: Data Mining, Universidad del Cauca.


1 First Midterm -> Topic 1: Data Mining, Universidad del Cauca

2 There are many methods for dealing with the Instance Selection (IS) problem. The best known is a greedy algorithm called the Condensed Nearest Neighbor Rule (CNN). CNN builds a subset S of the training set T such that every example in T is closer to an example of S of its own class than to any example of S of a different class.

3 The algorithm starts by selecting one instance of each class from T and inserting them into S. Then each instance of T is classified with 1-NN using only the instances currently in S. If an instance is misclassified, it is added to S, which guarantees that it will then be classified correctly. This process is repeated until no instance of T is misclassified.
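For concreteness, here is a minimal NumPy sketch of this procedure. The function name cnn_select, the random presentation order, and the rng seed parameter are illustrative assumptions, not anything specified in the slides:

import numpy as np

def cnn_select(X, y, rng=None):
    # Condensed Nearest Neighbor (Hart, 1968): build a subset S of the
    # training set T whose 1-NN rule classifies all of T correctly.
    rng = np.random.default_rng(rng)
    order = rng.permutation(len(y))     # the result is order dependent
    selected = []
    # Seed S with one instance of each class (as on slide 3).
    for c in np.unique(y):
        selected.append(order[y[order] == c][0])
    changed = True
    while changed:                      # repeat until no instance moves
        changed = False
        for i in order:
            if i in selected:
                continue
            # Classify instance i by 1-NN using only the current S.
            d = np.linalg.norm(X[selected] - X[i], axis=1)
            nearest = selected[int(np.argmin(d))]
            if y[nearest] != y[i]:      # misclassified: add i to S
                selected.append(i)
                changed = True
    return np.array(sorted(selected))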

4 Example: Designing a Classifier for Iris. A simple, well-known problem: iris classification. Three classes of iris: setosa, versicolor, and virginica. Four attributes: sepal length and width, and petal length and width. 150 examples, 50 per class. Available at: http://archive.ics.uci.edu/ml/datasets/Iris
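The dataset also ships with scikit-learn, so a hypothetical experiment could load and split it as follows; the 70/30 stratified split and the random seed are assumptions, not something stated on the slide:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Load the 150 iris examples (4 attributes, 3 classes of 50 each).
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3,
    stratify=iris.target, random_state=0)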

5 Examples of subsets selected on Iris: with the full training set, reduction 0% and test accuracy 95.33%; with the CNN-selected subset, reduction 97.78% and test accuracy 93.33%.
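Under the assumptions above, the two configurations on this slide could be reproduced roughly as follows. Exact reduction and accuracy figures will differ from the slide's, since they depend on the train/test split and on the order in which instances are presented to CNN:

from sklearn.neighbors import KNeighborsClassifier

# Baseline: 1-NN on the full training set (reduction 0%).
full = KNeighborsClassifier(n_neighbors=1).fit(X_train, y_train)

# Condensed: 1-NN on the subset chosen by cnn_select (sketch above).
idx = cnn_select(X_train, y_train, rng=0)
condensed = KNeighborsClassifier(n_neighbors=1).fit(X_train[idx], y_train[idx])

print("Reduction: %.2f%%" % (100.0 * (1 - len(idx) / len(y_train))))
print("Full 1-NN test accuracy:      %.4f" % full.score(X_test, y_test))
print("Condensed 1-NN test accuracy: %.4f" % condensed.score(X_test, y_test))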

6 Condensed Nearest Neighbour (CNN), Hart 1968.
Incremental; order dependent; neither minimal nor decision-boundary consistent; O(n³) for the brute-force method. Can be followed up with Reduced NN [Gates 1972]: remove a sample if doing so does not cause any incorrect classifications.
1. Initialize the subset with a single training example.
2. Classify all remaining samples using the subset, and transfer any incorrectly classified samples to the subset.
3. Return to step 2 until no transfers occur or the subset is full.
This produces a consistent set.
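The Reduced NN follow-up mentioned on this slide can be sketched in the same style: starting from the CNN subset, try removing each retained sample and keep the removal only if every training example is still correctly classified by 1-NN over what remains. The rnn_reduce function below is an illustrative assumption consistent with the one-line description above, not code from the deck:

def rnn_reduce(X, y, selected):
    # Reduced NN [Gates 1972]: drop a retained sample if doing so still
    # leaves every training example correctly 1-NN classified by S.
    S = list(selected)
    for i in list(S):
        trial = [j for j in S if j != i]
        if not trial:
            continue
        consistent = True
        for k in range(len(y)):
            d = np.linalg.norm(X[trial] - X[k], axis=1)
            if y[trial[int(np.argmin(d))]] != y[k]:
                consistent = False
                break
        if consistent:
            S = trial   # removing i caused no misclassification
    return np.array(S)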

