Although road safety performance is good, the number of people killed and injured on our roads remains unacceptably high. The road safety strategy was therefore introduced to support the new casualty reduction targets. The strategy includes all forms of intervention based on engineering, education and enforcement, and it recognizes that many different factors lead to traffic collisions and casualties. The main factor is vehicle speed. Traffic lights and other traffic management measures are used to reduce speed; speed cameras are one of these measures.
Speed cameras stand at the side of urban and rural roads and are usually placed to catch drivers who exceed the stipulated speed limit for that road. Their sole purpose is to identify and prosecute drivers who pass them while exceeding the stipulated speed limit.
At first glance this seems reasonable: ensuring that road users do not exceed the speed limit must be a good thing, because it increases road safety, reduces accidents and protects other road users and pedestrians.
So speed limits are a good idea. To enforce these speed limits, laws are passed making speeding an offence, and signs are erected to indicate the maximum permissible speeds. The police cannot be everywhere to enforce the speed limit, so enforcement cameras are directed to do this work. No one with an ounce of common sense would deliberately speed past a speed camera in order to be fined and penalized.
So nearly everyone slows down for the speed camera, and we finally have a solution to the speeding problem. If we assume that speed cameras are the only way to make drivers slow down, and that they work efficiently, then we would expect there to be a great number of them everywhere, and that they would be highly visible and identifiable so as to make drivers slow down.
This seminar presents algorithms for vision-based detection and classification of vehicles in monocular image sequences of traffic scenes recorded by a stationary camera. Processing is done at three levels: raw images, region level and vehicle level. Vehicles are modeled as rectangular patches with certain dynamic behavior. The proposed method is based on the establishment of correspondences between regions and vehicles, as the vehicles move through the image sequence. Experimental results from highway scenes are provided which demonstrate the effectiveness of the method. An interactive camera calibration tool is used for recovering the camera parameters using features in the image selected by the user.
One such system automatically captures an image of a moving vehicle and records data parameters, such as date, time, speed, operator and location, on the image. The operator can set a capture window, comprising a predetermined range of distances between the system and the moving vehicle, so that the image of the vehicle is captured automatically when it enters the capture window. The capture window distance can be entered manually through a keyboard or automatically using the laser speed gun. Automatic focusing is provided using distance information from the laser speed gun.
Traffic management and information systems rely on a suite of sensors for estimating traffic parameters. Magnetic loop detectors are often used to count vehicles passing over them. Vision-based video monitoring systems offer a number of advantages. In addition to vehicle counts, a much larger set of traffic parameters, such as vehicle classifications and lane changes, can be measured. Besides, cameras are much less disruptive to install than loop detectors.
Vehicle classification is important in computing the percentages of vehicle classes that use state-aid streets and highways. At present the available data is often outdated, and human operators frequently count vehicles manually at a specific street. An automated system can lead to more accurate pavement design (e.g., the decision about thickness), with obvious benefits in cost and quality. Even in large metropolitan areas, there is a need for data about the vehicle classes that use a particular street. A classification system can provide important data for a particular design scenario.
The system described here uses a single camera mounted on a pole or other tall structure, looking down on the traffic scene. It can be used for detecting and classifying vehicles in multiple lanes and for any direction of traffic flow. The system requires only the camera calibration parameters and the direction of traffic for initialization.
The seminar starts by describing a camera calibration tool; experimental results are then presented, and finally conclusions are drawn.
1. VEHICLE TRACKING
2. VEHICLE DETECTION
3. VEHICLE CLASSIFICATION
4. CAMERA CALIBRATION
The first step in detecting vehicles is segmenting the image to separate the vehicles from the background. There are various approaches to this, with varying degrees of effectiveness.
REQUIREMENTS FOR SEGMENTATION:
1. It should accurately separate vehicles from the background.
2. It should be fast enough to operate in real time.
3. It should be insensitive to lighting and weather conditions.
4. It should require a minimal amount of initialization.
- The expectation maximization (EM) method has been used to classify each pixel as belonging to a moving object.
- Kalman filtering is used to predict the background image during the next update interval. The error between the prediction and the actual background image is used to update the Kalman filter state variables.
- This method has the advantage that it automatically adapts to changes in lighting and weather conditions.
- However, it needs to be initialized with an image of the background without any vehicles present.
TIME DIFFERENCING APPROACH
- It consists of subtracting successive frames (or frames a fixed interval apart).
- This method is insensitive to lighting conditions and has the advantage of not requiring initialization with a background image.
- However, this method produces many small regions that can be difficult to separate from noise.
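The time differencing idea can be sketched in a few lines. This is a minimal illustration, assuming grayscale frames stored as nested lists of integers; the function name `frame_difference_mask` and the threshold value are hypothetical and do not come from the source.

```python
def frame_difference_mask(prev_frame, curr_frame, threshold):
    """Return a binary mask: 1 where |curr - prev| exceeds the threshold, else 0."""
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_frame, curr_frame)
    ]


# Two tiny 2x3 frames; only the middle column changes noticeably.
prev_frame = [[10, 10, 10], [10, 10, 10]]
curr_frame = [[10, 90, 10], [10, 95, 12]]
mask = frame_difference_mask(prev_frame, curr_frame, threshold=20)
```

Because only pixels that changed between the two frames are marked, no background image is needed, but small intensity fluctuations near the threshold can produce the scattered regions mentioned above.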
SELF-ADAPTIVE BACKGROUND SUBTRACTION
A self-adaptive background subtraction method is used for segmentation. This method automatically extracts the background from a video sequence and so manual initialization is not required. This segmentation technique consists of three tasks:
A. SEGMENTATION
For each frame of the video sequence (referred to as current image), we take the difference between the current image and the current background giving the difference image. The difference image is thresholded to give a binary object mask. The object mask is a binary image such that all pixels that correspond to foreground objects have the value 1, and all the other pixels are set to value 0.
B. ADAPTIVE BACKGROUND UPDATE
We update the background by taking a weighted average of the current background and the current frame of the video sequence. However, the current image also contains foreground objects. Therefore, before we do the update we need to classify the pixels as foreground and background and then use only the background pixels from the current image to modify the current background.
- BINARY OBJECT MASK AS GATING FUNCTION:
The binary object mask is used to distinguish the foreground pixels from the background pixels. The object mask decides which image to sample for updating the background. At those locations where the mask is 0 (corresponding to background pixels), the current image is sampled. At those locations where the mask is 1 (corresponding to foreground pixels), the current background is sampled.
- BACKGROUND UPDATE:
The result of this sampling is what we call the instantaneous background (IB). The current background (CB) is set to the weighted average of the instantaneous background and the current background:
$$ CB = \alpha\, IB + (1 - \alpha)\, CB \qquad (1) $$
- ESTIMATION OF WEIGHT:
The weights assigned to the current and instantaneous background affect the update speed. We want the update to be fast enough that changes in brightness are captured quickly, but slow enough that momentary changes do not persist for an unduly long time. The weight has been empirically determined to be 0.1. This gives the best tradeoff between update speed and insensitivity to momentary changes.
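The segmentation and background update steps can be sketched together as follows. This is an illustrative sketch, not the authors' implementation: images are nested lists of floats, the function names are hypothetical, and it assumes that the empirically chosen weight of 0.1 is the factor applied to the instantaneous background in equation (1).

```python
def object_mask(current, background, threshold):
    """1 where the absolute difference exceeds the threshold (foreground), else 0."""
    return [
        [1 if abs(c - b) > threshold else 0 for c, b in zip(cur_row, bg_row)]
        for cur_row, bg_row in zip(current, background)
    ]


def update_background(current, background, mask, alpha=0.1):
    """Blend the instantaneous background into the current background."""
    updated = []
    for cur_row, bg_row, mask_row in zip(current, background, mask):
        new_row = []
        for c, b, m in zip(cur_row, bg_row, mask_row):
            # Gating: foreground pixels (mask 1) sample the current background,
            # background pixels (mask 0) sample the current image.
            instantaneous = b if m else c
            new_row.append(alpha * instantaneous + (1 - alpha) * b)
        updated.append(new_row)
    return updated


background = [[100.0, 100.0]]
current = [[110.0, 200.0]]  # second pixel is covered by a bright object
mask = object_mask(current, background, threshold=50)
new_background = update_background(current, background, mask)
```

In this example the first pixel drifts slightly towards the new brightness, while the second pixel is left unchanged because the mask marks it as foreground.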
Figure: Computation of the instantaneous background.
C. DYNAMIC THRESHOLD UPDATE
After subtracting the current image from the current background, the resultant difference image has to be thresholded to get the binary object mask. Since the background changes dynamically, a static threshold cannot be used to compute the object mask. Moreover, since the object mask itself is used in updating the current background, a poorly set threshold would result in poor segmentation. Therefore we need a way to update the threshold as the current background changes. The difference image is used to update the threshold. In our images, a major portion of the image consists of the background. Therefore the difference image would consist of a large number of pixels having low values, and a small number of pixels having high values. We use this observation in deciding the threshold. The histogram of the difference image will have high values for low pixel intensities and low values for the higher pixel intensities. To set the threshold, we need to look for a dip in the histogram that occurs to the right of the peak. Starting from the pixel value corresponding to the peak of the histogram, we search towards increasing pixel intensities for a location on the histogram that has a value significantly lower than the peak value (we use 10% of the peak value). The corresponding pixel value is used as the new threshold.
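The histogram search described above can be sketched as follows. This is a minimal sketch under stated assumptions: the difference image holds integer intensities in the range 0 to 255, "significantly lower" is read as "below 10% of the peak count", and the function name `dynamic_threshold` and the fallback value are hypothetical.

```python
def dynamic_threshold(diff_image, levels=256, fraction=0.10):
    """Pick the first intensity right of the histogram peak whose count
    falls below `fraction` of the peak count."""
    histogram = [0] * levels
    for row in diff_image:
        for value in row:
            histogram[value] += 1
    peak = max(range(levels), key=lambda v: histogram[v])
    cutoff = fraction * histogram[peak]
    for value in range(peak + 1, levels):
        if histogram[value] < cutoff:
            return value
    return levels - 1  # assumption: fall back to the top level if no dip is found


# Mostly low differences (background) plus a few high ones (a vehicle).
diff_image = [[1] * 50 + [2] * 30 + [3] * 10 + [4] * 4 + [200] * 6]
threshold = dynamic_threshold(diff_image)
```

Here the peak is at intensity 1 with 50 pixels, so the search stops at the first intensity with fewer than 5 pixels.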
D. AUTOMATIC BACKGROUND EXTRACTION
In video sequences of highway traffic it might be impossible to acquire an image of the background. A method that can automatically extract the background from a sequence of video images would be very useful. It is assumed that the background is stationary and that any object with significant motion is part of the foreground. The method works on the video images and gradually builds up the background image over time. The background and threshold updating described above is done at periodic update intervals. To extract the background, we compute a binary motion mask by subtracting images from two successive update intervals. All pixels that have moved between these update intervals are considered part of the foreground. To compute the motion mask for update interval i (MMi), the binary object masks from update interval i (OMi) and update interval i-1 (OMi-1) are combined pixel by pixel (logical NOT and AND):
$$ MM_i = \lnot OM_{i-1} \wedge OM_i \qquad (2) $$
This motion mask is now used as the gating function to compute the instantaneous background as described above. Over a sequence of frames the current background looks similar to the background in the current image.
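Equation (2) is a pixelwise operation on two binary masks. A small sketch, assuming the masks are nested lists of 0 and 1 (the function name `motion_mask` is hypothetical):

```python
def motion_mask(prev_object_mask, curr_object_mask):
    """MM_i = NOT OM_(i-1) AND OM_i, evaluated per pixel on 0/1 masks."""
    return [
        [(1 - p) & c for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_object_mask, curr_object_mask)
    ]


prev_mask = [[1, 1, 0, 0]]
curr_mask = [[0, 1, 1, 0]]
moved = motion_mask(prev_mask, curr_mask)
```

Only the third pixel is set in the result: it is foreground now but was not foreground in the previous update interval.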
E. SELF-ADAPTIVE BACKGROUND SUBTRACTION RESULTS
Figure: Background adaptation to changes in lighting conditions. (a) Initial background provided to the algorithm. (b) Image of the scene at dusk. (c) Current background after 4 s. (d) Current background after 6 s. (e) Current background after 8 s.
The figure shows some images that demonstrate the effectiveness of our self-adaptive background subtraction method. Image (a) was taken during the day and was given to the algorithm as the initial background. Image (b) shows the same scene at dusk. Images (c), (d) and (e) show how the background adaptation algorithm updates the background so that it closely matches the background of image (b).
3. REGION TRACKING:
A vision-based traffic monitoring system needs to be able to track vehicles through the video sequence. Tracking eliminates multiple counts in vehicle counting applications. Moreover, the tracking information can also be used to derive other useful information like vehicle velocities. In applications like vehicle classification, the tracking information can also be used to refine the vehicle type and correct for errors caused due to occlusions. The output of the segmentation step is a binary object mask. We perform region extraction on this mask. In the region tracking step, we want to associate regions in frame i with the regions in frame i+1. This allows us to compute the velocity of the region as it moves across the image and also helps in the vehicle tracking stage. There are certain problems that need to be handled for reliable and robust region tracking. When considering the regions in frame i and frame i+1 the following problems might occur:
A region might disappear. Some of the reasons why this may happen are:
- The vehicle that corresponded to this region is no longer visible in the image, and hence its region disappears.
- Vehicles are shiny metallic objects. The pattern of reflection seen by the camera changes as the vehicles move across the scene. The segmentation process uses thresholding, which is prone to noise. At some point in the scene, the pattern of reflection from a vehicle might fall below the threshold, and hence those pixels will not be considered as foreground. Therefore the region might disappear even though the vehicle is still visible.
- A vehicle might become occluded by some part of the background or by another vehicle.
A new region might appear. Some possible reasons for this are:
- A new vehicle enters the field of view of the camera and so a new region corresponding to this vehicle appears.
- For the same reason as that mentioned above, as the pattern of reflections from a vehicle changes, its intensity might now rise above the threshold used for segmentation, and the region corresponding to this vehicle is now detected.
- A previously occluded vehicle might become visible again.
A single region in frame i-1 might split into multiple regions in frame i because:
- Two or more vehicles might have been passing close enough to each other that they occlude (or are occluded) and hence are detected as one connected region. As these vehicles move apart and are no longer occluded, the region corresponding to these vehicles might split up into multiple regions.
- Due to noise and errors during the thresholding process, a single vehicle that was detected as a single region might be detected as multiple regions as it moves across the image.
Multiple regions may merge. Some reasons why this may occur are:
- Multiple vehicles (each of which was detected as one or more regions) might occlude each other and during segmentation get detected as a single region.
- Due to errors in thresholding, a vehicle that was detected as multiple regions might later be detected as a single region.
We form an association graph between the regions from the previous frame and the regions in the current frame. We model the region tracking problem as a problem of finding the maximal weight graph. The association graph is a bipartite graph where each vertex corresponds to a region. All the vertices in one partition of this graph correspond to regions from the previous frame, P and all the vertices in the other partition correspond to regions in the current frame, C. An edge Eij between vertices Vi and Vj indicates that the previous region Pi is associated with the current region Cj. A weight w is assigned to each edge Eij. The weight of edge Eij is calculated as
$$ w(E_{ij}) = A(P_i \cap C_j) $$
i.e., the weight of edge Eij is the area of overlap between region Pi and region Cj.
BUILDING THE ASSOCIATION GRAPH
The region extraction step is done for each frame, resulting in new regions being detected. These become the current regions, C. The current regions from frame i become the previous regions P in frame i+1. To add the edges in this graph, a score is computed between each previous region Pi and each current region Cj. The score s is a pair of values $\langle s_{p \to c}, s_{c \to p} \rangle$. It is a measure of how closely a previous region Pi matches a current region Cj. The area of intersection between Pi and Cj is used in computing both values, each normalized by the area of one of the two regions:
$$ s_{p \to c} = \frac{A(P_i \cap C_j)}{A(P_i)} $$
$$ s_{c \to p} = \frac{A(P_i \cap C_j)}{A(C_j)} $$
This makes the score s independent of the actual area of both regions Pi and Cj.
Each previous region Pi is compared with each current region Cj and the area of intersection between Pi and Cj is computed. The current region Cimax that has the maximum value of $s_{p \to c}$ with Pi is determined, and an edge is added between Pi and Cimax. Similarly, for each region Cj, the previous region Pjmax that has the maximum value of $s_{c \to p}$ with Cj is determined, and an edge is added between vertices Pjmax and Cj. The rationale for having a two-part score is that it allows us to handle region splits and merges correctly. Moreover, by always selecting the region Cimax (Pjmax) that has the maximum value of $s_{p \to c}$ ($s_{c \to p}$), we do not need to set any arbitrary thresholds to determine whether an edge should be added between two regions. This also ensures that the resultant association graph is a maximal weight graph.
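The edge-building procedure can be illustrated with a short sketch. It makes several assumptions that are not in the source: regions are represented as non-empty sets of (row, column) pixels, each part of the score is the overlap area divided by the area of the region doing the choosing, and a best match with zero overlap is ignored. All function names are hypothetical.

```python
def overlap_area(region_a, region_b):
    """Area of intersection, counted in pixels."""
    return len(region_a & region_b)


def build_association_edges(previous, current):
    """Return {(i, j): weight} linking previous region i to current region j."""
    edges = {}
    # Each previous region picks the current region with the best p->c score.
    for i, p in enumerate(previous):
        scores = [overlap_area(p, c) / len(p) for c in current]
        if scores and max(scores) > 0:
            j = scores.index(max(scores))
            edges[(i, j)] = overlap_area(p, current[j])
    # Each current region picks the previous region with the best c->p score.
    for j, c in enumerate(current):
        scores = [overlap_area(p, c) / len(c) for p in previous]
        if scores and max(scores) > 0:
            i = scores.index(max(scores))
            edges[(i, j)] = overlap_area(previous[i], c)
    return edges


# One previous region that splits into two current regions.
previous = [{(0, 0), (0, 1), (0, 2), (0, 3)}]
current = [{(0, 0), (0, 1)}, {(0, 2), (0, 3)}]
edges = build_association_edges(previous, current)
```

The second pass is what captures the split: the previous region can choose only one best match, but each of the two current regions chooses it in turn, so both edges end up in the graph.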
When the edges are added to the association graph as described above, we might get a graph of the form shown in the figure. In this case, P0 can be associated with C0 or C1, or with both C0 and C1 (and similarly for P1). To be able to use this graph for tracking, we need to choose one assignment from among these. We enforce the following constraint on the association graph: in every connected component of the graph, only one vertex may have degree greater than 1. A graph that meets this constraint is considered a conflict-free graph. A connected component that does not meet this constraint is considered a conflict component. For each conflict component we add edges in increasing order of weight if and only if adding the edge does not violate the constraint mentioned above. If adding an edge Eij would violate the constraint, we simply ignore the edge and select the next one. The resulting graph may be sub-optimal (in terms of weight); however, this does not have an unduly large effect on the tracking and is good enough in most cases.
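The constraint check and the greedy edge selection can be sketched as below. This is an illustration only: the function names are hypothetical, edges are (i, j) pairs linking previous region i to current region j, and the greedy pass is run over the whole graph rather than per conflict component, on the assumption that this gives the same result because components do not share edges. The edge order follows the text (increasing weight).

```python
def is_conflict_free(edges):
    """True if every connected component has at most one vertex of degree > 1."""
    adjacency = {}
    for i, j in edges:
        adjacency.setdefault(("P", i), set()).add(("C", j))
        adjacency.setdefault(("C", j), set()).add(("P", i))
    seen = set()
    for start in adjacency:
        if start in seen:
            continue
        # Depth-first walk over one connected component.
        stack, component = [start], []
        seen.add(start)
        while stack:
            vertex = stack.pop()
            component.append(vertex)
            for neighbour in adjacency[vertex]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        if sum(1 for v in component if len(adjacency[v]) > 1) > 1:
            return False
    return True


def resolve_conflicts(weighted_edges):
    """Greedily keep edges, in increasing order of weight, that keep the graph conflict-free."""
    kept = []
    for edge, _weight in sorted(weighted_edges.items(), key=lambda item: item[1]):
        if is_conflict_free(kept + [edge]):
            kept.append(edge)
    return kept


# The ambiguous case from the text: P0 and P1 are both linked to C0 and C1.
weighted_edges = {(0, 0): 5, (0, 1): 2, (1, 0): 1, (1, 1): 6}
kept_edges = resolve_conflicts(weighted_edges)
```

With these hypothetical weights the two lightest edges are kept and the two heaviest are rejected, which illustrates why the text describes the result as possibly sub-optimal in terms of weight.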
4. RECOVERY OF VEHICLE PARAMETERS:
To be able to detect and classify vehicles, the location, length, width and velocity of the regions (which are vehicle fragments) need to be recovered from the image. Knowledge of the camera calibration parameters is necessary for estimating these attributes, so accurate calibration can significantly affect the computation of vehicle velocities and the classification. Calibration parameters are usually difficult to obtain from the scene, as they are rarely measured when the camera is installed. Moreover, since the cameras are installed approximately 20-30 feet above the ground, it is usually difficult to measure quantities such as pan and tilt that would help in computing the calibration parameters. It therefore becomes difficult to calibrate after the camera has been installed.
One way to compute the camera parameters is to use known facts about the scene. For example, we know that the road, for the most part, is restricted to a plane. We also know that the lane markings are parallel and that the lengths of the markings, as well as the distances between them, are precisely specified. Once the camera parameters are computed, any point on the image can be back-projected onto the road. We therefore have a way of finding the distance between any two points on the road from their image locations.
The proposed system is easy to use and intuitive to operate, using obvious landmarks, such as lane markings, and familiar tools, such as a line-drawing tool. The Graphical User Interface (GUI) allows the user to first open an image of the scene. The user is then able to draw different lines and optionally assign lengths to those lines. The user may first draw lines that represent lane separation, and may then draw lines to designate the width of the lanes. The user may also designate known lengths in conjunction with the lane separation marks. The system can then compute the calibration parameters automatically.
An additional feature of the interface is that it allows the user to define traffic lanes in the video, and also the direction of traffic in these lanes. Also, special hot spots can be indicated on the image, such as the location where we want to compute vehicles' speeds. The only real difficulty arose with respect to accuracy in determining distances in the direction of the road. Some of these inaccuracies arise because the markings on the road themselves are not precise. Another part of the inaccuracy depends on the user's ability to mark endpoints in the image.
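Back-projection onto the road plane can be illustrated with a planar homography. This is an assumption made for the sketch: the source does not say how the calibration tool represents the image-to-road mapping, and the matrix `H`, its values and the function names below are all hypothetical.

```python
import math


def back_project(H, u, v):
    """Map image point (u, v) to road-plane coordinates with a 3x3 homography H."""
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    return x / w, y / w


def road_distance(H, image_point_a, image_point_b):
    """Distance on the road between two points given by their image locations."""
    xa, ya = back_project(H, *image_point_a)
    xb, yb = back_project(H, *image_point_b)
    return math.hypot(xb - xa, yb - ya)


# Hypothetical calibration: a pure scaling of 0.05 m/pixel across the road
# and 0.1 m/pixel along it. A real camera would give a full projective matrix.
H = [[0.05, 0.0, 0.0],
     [0.0, 0.1, 0.0],
     [0.0, 0.0, 1.0]]
distance = road_distance(H, (0, 0), (60, 40))
```

Dividing such a distance by the time between two frames would give a speed estimate for a tracked point, which is how the distance measurement feeds the velocity computation.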
5. VEHICLE IDENTIFICATION
A vehicle is made up of (possibly multiple) regions. The vehicle identification stage groups regions together to form vehicles. New regions that do not belong to any vehicle are considered orphan regions. A vehicle is modeled as a rectangular patch whose dimensions depend on the dimensions of its constituent regions. Thresholds are set for the minimum and maximum sizes of vehicles based on typical dimensions of vehicles. A new vehicle is created when an orphan region of sufficient size is tracked over a sequence of a number of frames.
6. VEHICLE TRACKING
Our vehicle model is based on the assumption that the scene has a flat ground. A vehicle is modeled as a rectangular patch whose dimensions depend on its location in the image. The dimensions are equal to the projection of the vehicle at the corresponding location in the scene. A vehicle consists of one or more regions, and a region might be owned by zero or more vehicles. The region tracking stage produces a conflict-free association graph that describes the relations between regions from the previous frame and regions from the current frame. The vehicle tracking stage updates the location, velocity and dimensions of each vehicle based on this association graph. The location and dimensions of a vehicle are computed as the bounding box of all its constituent regions. The velocity is computed as the weighted average of the velocities of its constituent regions. The weight for a region Pi belonging to vehicle v is based on the area of overlap between vehicle v and region Pi. The vehicle's velocity is used to predict its location in the next frame. A region can be in one of five possible states. The vehicle tracker performs different actions depending on the state of each region that is owned by a vehicle. The states and corresponding actions performed by the tracker are:
1. Update: A previous region Pi matches exactly one current region Cj. The tracker simply updates the ownership relation so that the vehicle v that owned Pi now owns Cj.
2. Merge: Regions Pi ... Pk merge into a single region Cj. The area of overlap between Cj and each vehicle assigned to Pi ... Pk is computed; if the overlap is above a minimum threshold, Cj is assigned to that vehicle.
3. Split: Region Pi splits into regions Cj ... Ck. Again, the area of overlap between each vehicle v that owns Pi and each of Cj ... Ck is computed. If it is greater than a minimum value, the region is assigned to v.
4. Disappear: A region Pi in P is not matched by any Cj in C. The region is simply removed from all the vehicles that owned it. If a vehicle loses all its regions, it becomes a phantom vehicle. Sometimes a vehicle may become temporarily occluded and then later reappear. Phantoms prevent such a vehicle, once it is no longer occluded, from being considered a new vehicle. A phantom is kept around for a few frames (3), and if it cannot be resurrected within this time, it is removed.
5. Appear: A region Cj in C does not match any Pi in P. We check Cj against the phantom vehicles. If a phantom vehicle overlaps new region(s) of sufficient area, it is resurrected. If the region does not belong to a phantom vehicle and is of sufficient size, a new vehicle is created.
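The five states can be read directly off a conflict-free association graph. The sketch below only classifies the regions; it does not implement the ownership updates or the phantom handling. The function name and the output layout are hypothetical, and edges are assumed to be (i, j) pairs linking previous region i to current region j.

```python
def classify_region_events(edges, num_previous, num_current):
    """Group regions by tracker state using the degrees in the association graph."""
    prev_links = {i: [] for i in range(num_previous)}
    curr_links = {j: [] for j in range(num_current)}
    for i, j in edges:
        prev_links[i].append(j)
        curr_links[j].append(i)

    events = {"update": [], "merge": [], "split": [], "disappear": [], "appear": []}
    for i, matches in prev_links.items():
        if not matches:
            events["disappear"].append(i)              # Pi matched by no Cj
        elif len(matches) > 1:
            events["split"].append((i, sorted(matches)))  # Pi -> Cj ... Ck
    for j, matches in curr_links.items():
        if not matches:
            events["appear"].append(j)                 # Cj matches no Pi
        elif len(matches) > 1:
            events["merge"].append((sorted(matches), j))  # Pi ... Pk -> Cj
        elif len(prev_links[matches[0]]) == 1:
            events["update"].append((matches[0], j))   # one-to-one match
    return events


edges = [(0, 0), (1, 1), (2, 1), (3, 2), (3, 3)]
events = classify_region_events(edges, num_previous=5, num_current=5)
```

In this example P0 is a plain update, P1 and P2 merge into C1, P3 splits into C2 and C3, P4 disappears and C4 appears.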
7. VEHICLE CLASSIFICATION:
To be useful, a vehicle classification system should categorize vehicles into a sufficiently large number of classes. However, as the number of categories increases, the processing time required for classification also increases. Therefore, a hierarchical classification method is needed that can quickly categorize vehicles at a coarse granularity. Then, depending on the application, further classification at the desired level of granularity can be done. We use vehicle dimensions to classify vehicles into two categories: cars (which constitute the majority of vehicles) and non-cars (vans, SUVs, pickup trucks, tractor-trailers, semis and buses). Separating, say, SUVs from pickup trucks would require the use of more sophisticated, shape-based techniques. However, doing a coarse, dimension-based classification at the top level significantly reduces the amount of work that needs to be done at a lower level. The final goal of our system is to classify vehicles at multiple levels of granularity, but currently we classify vehicles into the aforementioned categories (based on the needs of the funding agency).
Since classification is based on the dimensions of vehicles, we compute the actual length and height of the vehicles. Due to the camera orientation, the computed height is actually a combination of the vehicle's width and height. It is not possible to separate the two using merely the vehicle boundaries in the image and the camera parameters. The category of a vehicle is determined from its length and this combined value. We took a sample of 50 cars and 50 trucks and calculated the mean and variance of these samples. We used the combined width/height value for the vehicle height computed using the camera calibration parameters. From these samples, we were able to compute a discriminant function that can be used to classify vehicles. The average dimensions of a truck are only slightly larger than the dimensions of the average car. In some cases, cars may actually be longer and wider than trucks (e.g., a Cadillac vs. a small pickup). This ambiguity allows some error to enter the system when we define a decision boundary.
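One way to turn class means and variances into a discriminant function is to model each class with an independent Gaussian per feature and pick the class with the higher likelihood. This is an assumption made for illustration; the source does not state the form of its discriminant function. The sample values below are invented and are not the measurements from the 50 cars and 50 trucks.

```python
import math


def fit_gaussian(samples):
    """Per-feature mean and variance for a list of (length, height) tuples."""
    n = len(samples)
    dims = len(samples[0])
    means = [sum(s[d] for s in samples) / n for d in range(dims)]
    variances = [sum((s[d] - means[d]) ** 2 for s in samples) / n for d in range(dims)]
    return means, variances


def log_likelihood(x, means, variances):
    """Log-density of x under independent Gaussians, one per feature."""
    return sum(
        -0.5 * math.log(2 * math.pi * var) - (xi - m) ** 2 / (2 * var)
        for xi, m, var in zip(x, means, variances)
    )


def classify(x, car_model, non_car_model):
    """Return the category whose model explains the dimensions better."""
    if log_likelihood(x, *car_model) >= log_likelihood(x, *non_car_model):
        return "car"
    return "non-car"


# Hypothetical (length, combined width/height) samples in metres.
cars = [(4.2, 1.9), (4.5, 2.0), (4.8, 2.1), (4.4, 2.0)]
trucks = [(5.5, 2.6), (6.0, 2.8), (7.0, 3.0), (5.8, 2.7)]
car_model = fit_gaussian(cars)
non_car_model = fit_gaussian(trucks)
label_small = classify((4.5, 2.0), car_model, non_car_model)
label_large = classify((6.2, 2.8), car_model, non_car_model)
```

When the two classes overlap in size, as the text notes for large cars and small pickups, vehicles near the decision boundary will be misclassified whatever form the discriminant takes.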
8. RESULTS:
The system was implemented on a dual Pentium 400 MHz PC equipped with a C80 Matrox Genesis vision board. The processing is done at a frame rate of 15 fps. With more optimized algorithms, the processing time per frame could be reduced significantly. Errors in detection were most frequently due to occlusions and/or poor segmentation. Because of imperfections in updating the background, noise can be added to or subtracted from the detected vehicles. At times, the noise is sufficient to cause the detected object to become too large or too small to be considered a vehicle. Also, when multiple vehicles occlude each other, they are often detected as a single vehicle. However, if the vehicles later move apart, the tracker is robust enough to correctly identify them as separate vehicles. Unfortunately, the two vehicles will continue to persist as a single vehicle if the relative motion between them is small. In such a case, the count of vehicles is incorrect. Another thing to note is that the images we used are grayscale. Since our segmentation approach is intensity based, vehicles whose intensity is similar to that of the road surface are sometimes missed, or detected as fragments that are too small to be reliably separated from noise. This too will cause an incorrect vehicle count. Classification errors were mostly due to the small separation between vehicle classes. Because size is used as the metric, it is impossible to correctly classify all vehicles. The algorithm is general enough to work with multiple traffic directions. Also, the data was acquired on an overcast day, thus removing the problem of shadows.
9. CONCLUSION AND FUTURE WORK:
We have presented a model-based vehicle tracking and classification system capable of working robustly under most circumstances. The system is general enough to be capable of detecting, tracking and classifying vehicles while requiring only minimal scene-specific knowledge. In addition to the vehicle category, the system provides location and velocity information for each vehicle as long as it is visible. Initial experimental results from highway scenes were presented. To enable classification into a larger number of categories, we intend to use a non-rigid model-based approach to classify vehicles. Parameterized 3D models of exemplars of each category will be used. Given the camera calibration a 2D projection of the model will be formed from this viewpoint. This projection will be compared with the vehicles in the image to determine the class of the vehicle.
K.D. Baker and G.D. Sullivan, Performance assessment of model-based tracking, in Proc. of the IEEE Workshop on Applications of Computer Vision, pp. 28-35, Palm Springs, CA, 1992.
D. Beymer, P. McLauchlan, B. Coifman, and J. Malik, A real-time computer vision system for measuring traffic parameters, in Proc. of IEEE Conf. Computer Vision and Pattern Recognition, pp. 496-501, Puerto Rico, June 1997.