School of Computer Science and Information Technology
University of Nottingham
Jubilee Campus
NOTTINGHAM NG8 1BB, UK

Computer Science Technical Report No. NOTTCS-TR-2001-1

Arc Segmentation in Engineering Drawings
Dave Elliman

First released: March 2001
Copyright 2001 Dave Elliman

In an attempt to ensure good-quality printouts of our technical reports, from the supplied PDF files, we process to PDF using Acrobat Distiller. We encourage our authors to use outline fonts coupled with embedding of the used subset of all fonts (in either TrueType or Type 1 formats) except for the standard Acrobat typeface families of Times, Helvetica (Arial), Courier and Symbol. In the case of papers prepared using TeX or LaTeX we endeavour to use subsetted Type 1 fonts, supplied by Y&Y Inc., for the Computer Modern, Lucida Bright and Mathtime families, rather than the public-domain Computer Modern bitmapped fonts. Note that the Y&Y font subsets are embedded under a site license issued by Y&Y Inc. For further details of site licensing and purchase of these fonts visit
Arc Segmentation in Engineering Drawings
Dave Elliman
University of Nottingham, UK
March 9, 2001

Abstract

This paper describes a method for the recognition of outlines in engineering drawings in terms of line and arc primitives. The approach is based on vectorizing a binary image, smoothing the vectors to a sequence of small straight lines, and then attempting to fit arcs. Two further levels of constraints are then applied. The first ensures smoothness where appropriate, which results in vectors that are pleasing to the eye and suitable for some manufacturing processes. The second, optional, stage applies rules and sometimes results in perfectly correct views, but it can fail, and can introduce distortion in unusual cases. Complex drawings are not successfully processed at present due to limitations in the extraction of vector sequences at branch points. This is not a major limitation and is expected to be resolved by current research, in time for the GREC-2001 conference.

1 Introduction

Engineering drawings contain text, some special drawing symbols, outlines of views, shading, hidden detail, and dimensioning lines, some of which may have an arrowhead at one or both ends. In this work the emphasis is on processing the outlines of views, and producing a description of these in terms of a sequence of straight-line segments and circular arcs. The method has given excellent results when tested on scanned images which contain only these primitives, or which can be closely approximated by them, for example the shape in Figure 2. Further work is underway to make it successful for more realistic images such as that shown in Figure 3. Further work is also needed to recognise elliptic curves, although these do seem to be approximated as rather predictable sequences of arcs, and this may not be too challenging a task.

A useful description of other methods of arc segmentation applied to engineering drawings can be found in [2]. Many authors have used the Hough transform to identify arcs in images of all types. An excellent reference on the approach is that of Illingworth and Kittler [5].
However, this approach was not found to be very successful by the present author, in that the end points and centre of the arc were often predicted with poor accuracy. This may have been a result of finding arcs after extracting smoothed vectors; however, the algorithm is very slow if applied at the pixel stage.

The paper describes methods of arc fitting and of enforcing smoothness and other likely constraints on the geometry of the view. These techniques seem powerful and offer much promise, but further work is needed to apply them to more complex drawings where outlines branch.

2 The Problem to be Solved

Engineering drawings can be considered to be constructed from the set of primitive symbols:

$$\mathit{primitives} := \{\mathit{curves}, \mathit{arrowheads}, \mathit{text}, \mathit{other\ symbols}\}$$

Here curves is used in a general sense to represent all the types of line on a drawing, whether continuous or broken, curved or straight. A curve is fully defined by a set of attributes:

$$\mathit{curveattributes} := \{\mathit{path}, \mathit{thickness}, \mathit{pattern}\}$$

The path is a description of the two-dimensional locus of the curve. This might be at a low level of abstraction, in the form of a list of coordinate pairs, or a sequence of higher level geometrical entities such as straight lines, arcs, conics, or splines. These higher level descriptions are obviously more useful in understanding the drawing, and in producing an accurate representation of its contents in some appropriate graphics file.

The thickness of a line is an intuitively obvious parameter, which offers evidence as to the meaning of the line in the drawing. For example, the outline of a view is often drawn with a greater thickness than lines associated with dimensioning or shading.

Lines may be broken into dashes to show hidden detail, or into a regular dot-dash pattern as with centre lines. Here only continuous lines will be considered. However, the extension to broken lines is largely a case of extrapolating between end-points. The same algorithms and mathematics may then be applied.

A path is considered to be a sequence of coordinate pairs or points $p_i := (x_i, y_i)$ of the form:

$$\mathit{path} := p_1, p_2, \ldots, p_n$$

where $p_n = p_1$ when the path is closed. Such a path is likely to be the result of applying a vectorization process to the image. Early methods involved applying a thinning algorithm to a binary image so as to generate a skeleton of single pixel width. Lines could then be followed in this skeleton and perhaps smoothed to remove noise. Later methods that are more accurate and often faster are now more likely to be used, for example those of Elliman [3] or Dori and Liu [1]. Elliman's algorithm was used for the work reported here.

A higher level description of a path might be as a sequence of graphics primitives such as straight lines, arcs, ellipses or splines. These primitives can be defined in a variety of ways, but the chosen representation was:

$$\mathit{line} := (p_s, p_e)$$

That is, a line between a start point, $p_s$, and an end point, $p_e$. An arc is defined by a tuple of three points:

$$\mathit{arc} := (p_s, p_e, p_c)$$

where $p_c$ is the centre of the arc. In all cases we use real numbers which are calculated to sub-pixel accuracy wherever possible.

Engineering drawings may contain curves of any arbitrary shape, especially for moulded or sculpted components. However, the limitations of ordinary machining processes result in the large majority of surfaces being flat, cylindrical, or spherical. The edges of such surfaces are represented by the curves in the views which constitute an engineering drawing. The edges of these surfaces are always straight lines, circular arcs, or ellipses. Where a view is orthogonal to a cylindrical surface the curve is an arc, and when the view is at any other angle it is an ellipse. Much more often than not a curve turns out to be a circular arc or a straight line. Thus we may model many outlines in engineering drawings as a sequence of curves, as follows:

$$\mathit{outline} := c_1, c_2, \ldots, c_n$$

where:

$$\mathit{curve} ::= \mathit{line} \mid \mathit{arc}$$

We define the following access functions for a curve $c$:

$\mathrm{start}(c)$: the start point of the curve
$\mathrm{end}(c)$: the end point of the curve
$\mathrm{centre}(c)$: the centre point of the curve, or undefined for a straight line
$\mathrm{Next}(c_i) := c_{i+1}$ when $i < n$
$\mathrm{Next}(c_i) := c_1$ when $i = n$

The last of these returns the next curve in the sequence.

3 The Algorithm Used

The starting point for the recognition process is a binary image, and we have stored these in TIFF Group IV format. The final result is a vector description in terms of a sequence of straight line and arc segments, which may be written to a file in AutoCAD DXF format.

The binary image is first vectorized using the algorithm of Elliman [3]. This produces a representation that comprises chains of very small vectors, often a few pixels in length. Before attempting to fit lines and arcs to this data, it is smoothed by a simple method that removes intermediate points that do not indicate a consistent curvature, but which merely straddle either side of a least-squares line. This results in rather longer vectors on average, and makes the rest of the processing more efficient.

The next step is an attempt to fit the largest possible arc to a curve, that is, one that accommodates all $n$ points in the sequence. First a check is made to see if the curve is adequately represented as a straight line. If this is not the case then a putative arc is fitted. This is formed by making the chord between the end-points of the curve and then finding the point, $p_m$, most distant from this line. An arc is fitted through this point and the two end points. The maximum distance of the remaining points of the curve from this arc is then calculated, as shown in Figure 1. If this is less than a threshold then the arc is accepted as an adequate representation of the curve.

Figure 1: The construction for putative arc fitting

If this fails, an attempt is made to fit the two possible arcs which contain $n - 1$ points: one that is found by discarding the first point, and another that is found by discarding the last. Three possible arcs may be tried on discarding two points from an end of the curve, or one point from each end.
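The construction and the trimming search described here can be sketched in Python as follows. This is only an illustration under stated assumptions, not the implementation used for the reported work: the function names, the tolerance parameter `tol`, and the use of radial distance from the fitted circle as the deviation measure are all choices made for the sketch, and the preliminary straight-line test is reduced to rejecting collinear triples.

```python
import math

def circle_through(a, b, c):
    """Return ((cx, cy), r) for the circle through three points, or None if collinear."""
    (ax, ay), (bx, by), (cx, cy) = a, b, c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return None
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return (ux, uy), math.hypot(ax - ux, ay - uy)

def chord_distance(p, a, b):
    """Perpendicular distance from p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    length = math.hypot(bx - ax, by - ay)
    if length == 0.0:
        return math.hypot(px - ax, py - ay)
    return abs((bx - ax) * (ay - py) - (ax - px) * (by - ay)) / length

def fit_putative_arc(points, tol):
    """Fit a circle through the end points and the point farthest from the chord.

    Returns ((cx, cy), r) if every point lies within tol of the circle
    (radial distance, an assumption of this sketch), otherwise None.
    """
    start, end = points[0], points[-1]
    interior = points[1:-1]
    if not interior:
        return None
    far = max(interior, key=lambda p: chord_distance(p, start, end))
    circle = circle_through(start, far, end)
    if circle is None:
        return None
    (cx, cy), r = circle
    worst = max(abs(math.hypot(x - cx, y - cy) - r) for x, y in points)
    return circle if worst < tol else None

def longest_arc(points, tol):
    """Try all n points, then every way of discarding k points from the two ends.

    Returns (first_index, end_index, circle) for the first acceptable fit,
    with end_index exclusive, or None if no arc is found.
    """
    n = len(points)
    for k in range(0, n - 2):                    # keep at least three points
        for from_start in range(k, -1, -1):      # k discards give k + 1 splits
            from_end = k - from_start
            circle = fit_putative_arc(points[from_start:n - from_end], tol)
            if circle is not None:
                return from_start, n - from_end, circle
    return None
```

Calling `longest_arc(points, tol)` on a chain of smoothed vertices returns the index range of the longest acceptable arc, so the segments on either side of that range can be processed in the same way.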

There are four possible arcs that may be tried by discarding three points, five ways of discarding four, and so on. These five ways are fairly obvious, but are listed in the table below:

From Start   From End
4            0
3            1
2            2
1            3
0            4

The algorithm terminates if there are only three points left in the chain, as it is then guaranteed that an arc will be returned, but this may not be a plausible representation of the curve. The method has similarities to that of [6] but is believed to be more efficient.

The worst case complexity of the algorithm occurs when no arc is found. If there were $n$ points in the curve then $n - 4$ passes of the procedure will have been attempted. On the first pass two putative arcs will have been fitted, with one extra arc fit being attempted on each successive pass. This is $O(n^2)$ worst case complexity, and with judicious re-use of previous calculations it seems a highly efficient procedure for finding the longest likely arc primitive within a curve. The procedure can then be applied recursively to the remaining segments of curve either side of the arc.

4 Geometrical Constraints

The recognition process described so far has been a data-driven or bottom-up process with the implicit top-down model that the binary image comprises stroke-like features that are a sequence of straight lines and circular arcs. In real drawings the vast majority of curves have additional constraints on the end points of the constituent strokes. The first is the obvious one of continuity:

$$\mathrm{end}(c_i) = \mathrm{start}(\mathrm{Next}(c_i))$$

More interesting is the constraint of smoothness. This does not always apply: many of the junctions between the arc and line primitives represent a corner. The angle between two line segments, or between a line segment and an arc tangent, or between two arc segments, can easily be found. A cornerness measure is defined in terms of $l_1$, $l_2$ and $\theta$, where $l_1$ and $l_2$ are the lengths of the primitives if they are lines, or a fixed multiple of $r$, where $r$ is the radius of the arc, if they are arcs, and $\theta$ is the angle between the lines or tangents. The multiple was arrived at by experiment, but the results are not very sensitive to the value used. If the value of the cornerness measure is below a threshold, then it is assumed that the intention of the draughtsman was to produce a smooth line. This insight allows an adjustment to be made to the start and end points of the primitives so as to make the change of direction zero at this point.

This improves the visual quality of the resulting vector image markedly. In fact the first commercial application of the technique has been in sign-making. An artist can draw the outline of the sign to his satisfaction, and this program will turn a scanned image of the artwork into a vector representation which may then be machined using an NC router. The enforcement of the smoothness constraints resulted in an acceptable quality for most of the cases scanned. An example of the kind of result produced by this process is shown in Figure 2, which is essentially indistinguishable from the original artwork.
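The angle at a junction between two primitives, on which the smoothness test depends, can be computed along the following lines. This Python sketch covers only the angle; the cornerness formula and its threshold are not reproduced here. The function names and the explicit `ccw` flag giving the direction of travel around an arc are assumptions of the sketch, since a start point, end point and centre alone do not say which way round the arc runs.

```python
import math

def line_direction(start, end):
    """Direction of travel along a straight line, in radians."""
    return math.atan2(end[1] - start[1], end[0] - start[0])

def arc_tangent(point, centre, ccw):
    """Direction of travel at `point` on a circular arc about `centre`.

    The tangent is perpendicular to the radius; `ccw` (an assumption of
    this sketch) selects counter-clockwise or clockwise travel.
    """
    radial = math.atan2(point[1] - centre[1], point[0] - centre[0])
    return radial + (math.pi / 2 if ccw else -math.pi / 2)

def junction_angle(dir_in, dir_out):
    """Absolute change of direction at a junction, in the range [0, pi]."""
    diff = (dir_out - dir_in + math.pi) % (2 * math.pi) - math.pi
    return abs(diff)
```

For example, a line from (0, 0) to (1, 0) followed by a counter-clockwise arc starting at (1, 0) with centre (1, 1) gives a junction angle of zero, which is the smooth case, whereas two lines meeting at a right angle give an angle of pi / 2.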
Figure 2: An Example Drawing where Line and Arc Fitting is Successful

Enforcing the smoothness constraints required the identification of five distinct cases, and some complex coordinate geometry which is described, together with corresponding Java code, in [4]. These are as follows:

1. Find an arc that is tangential to a line and passes through two points (two solutions).
2. Find an arc that is tangential to two other arcs and passes through a point (two solutions).
3. Find an arc that is tangential to two lines and passes through a point (four solutions).
4. Find an arc that is tangential to a line and to another arc and which passes through a point (two solutions).
5. Find a line that is tangential to two arcs and passes through a point (four solutions).

In each case there are either two or four solutions to be considered. Imaginary solutions are discarded, and a simple test of proximity will identify which of the remaining solutions applies.

5 Further Constraints based on a Drawing Model

The statistical properties of real drawings, in comparison with the results of this arc and line finding, make it tempting to apply rules to further correct the curves generated. These rules may be wrong from time to time, causing an increase in the error of a curve. They result in an improvement many times more often than they result in a mistake, however, and will thus greatly improve the recognition performance on a ground-truthed document on most plausible metrics. The rules that have been applied are:

1. Near horizontal lines are horizontal.
2. Near vertical lines are vertical.
3. Lines near to 45° are at 45°.
4. Lines near to 30° are at 30°.
5. Lines near to 60° are at 60°.
6. Arcs that are close to a full semicircle are a semicircle.
7. Arcs that are close to a quarter of a circle are a quarter of a circle.
8. Chords of arcs that are close to a multiple of 45° are at that multiple of 45°.
9. Lengths that are close to a whole multiple of (say) 5 mm are set to the closest exact multiple of 5 mm.

The algorithm for applying these rules needed careful thought. Each rule means that one or other or both end points need to be moved. Unfortunately there are an infinite number of possible changes that will satisfy an individual rule. What is needed is a change that simultaneously satisfies each rule that is to be applied. Again there may be anything from zero to infinitely many ways of doing this. If there is no way in which all the rules can be satisfied then the rule that is least certain is dropped, and another attempt made. The best adjustment to choose from an infinite set is the one that makes the smallest changes to the position of the end points. The sum of absolute distances is used for this metric. In some cases the application of these rules results in a perfect view, although this has only been achieved for fairly simple cases so far.

The approach used was to take an arbitrary start point and then form a description of the curve in terms of these corrected primitives. If the curve is closed, then the rules may be inconsistent if the final end-point is not the same as the start-point. The average movement of all vertices can then be calculated, and each point adjusted by this value to produce a minimal adjustment. This algorithm is often successful, and it is impressive when a perfectly correct view is produced from a poor quality scanned image. However, there are cases where the rules do have a consistent solution which is not found by this method. Other methods, such as a gradient search of the state space, are more powerful, but lose the simplicity and high efficiency of this simple approach. It is probably worth using this as a first pass in any case, and it would solve perhaps ninety percent of cases with minimal computation.
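As an illustration of the first-pass approach, the sketch below applies rules 1 to 5 and rule 9 to an outline made of straight lines only, rebuilding it from an arbitrary start vertex using corrected directions and lengths. It is a minimal Python sketch under several assumptions: the tolerances `tol_deg` and `tol_len` are invented for the example, the 30°, 45° and 60° rules are taken to apply in every quadrant, and neither the arc rules nor the closure adjustment for closed curves is included.

```python
import math

def snap_angle(angle_deg, tol_deg=3.0):
    """Snap a direction to 0, 30, 45 or 60 degrees (plus multiples of 90) if close."""
    candidates = [q + a for q in range(-360, 361, 90) for a in (0, 30, 45, 60)]
    best = min(candidates, key=lambda c: abs(c - angle_deg))
    return float(best) if abs(best - angle_deg) <= tol_deg else angle_deg

def snap_length(length, step=5.0, tol_len=0.5):
    """Snap a length to the nearest whole multiple of `step` if close."""
    nearest = round(length / step) * step
    if nearest > 0 and abs(nearest - length) <= tol_len:
        return nearest
    return length

def correct_outline(vertices, tol_deg=3.0, step=5.0, tol_len=0.5):
    """Rebuild a polyline from its first vertex using snapped angles and lengths."""
    out = [vertices[0]]
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:]):
        angle = snap_angle(math.degrees(math.atan2(y1 - y0, x1 - x0)), tol_deg)
        length = snap_length(math.hypot(x1 - x0, y1 - y0), step, tol_len)
        px, py = out[-1]
        out.append((px + length * math.cos(math.radians(angle)),
                    py + length * math.sin(math.radians(angle))))
    return out
```

Applied to a slightly distorted 10 mm square such as `[(0, 0), (10.2, 0.1), (10.1, 9.9), (0.05, 10.1), (0, 0)]`, this returns an exact square up to floating-point rounding. In less tidy cases the rebuilt outline need not close, which is where the averaging step described above comes in.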
Figure 3: A Complex Drawing which is not Processed Successfully

6 Discussion and Conclusions

The methods described in this paper have been highly successful in producing line and arc primitives from scanned images such as that shown in Figure 2. The line and arc representation of the curve is pleasing to the eye and results in a greatly improved economy of representation. Simple outline views of engineering components also produce gratifying results, and the application of the beautification rules above often gives a perfect or near-perfect result. However, more complex scanned images such as that of Figure 3 [...]

References

[1] D. Dori and W. Liu. Sparse pixel vectorization: an algorithm and its performance evaluation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 21(3):202-215, 1999.
[2] Dov Dori. Vector-based arc segmentation in the machine drawing understanding system environment. IEEE Transactions on Pattern Analysis and Machine Intelligence, 17(11), 1995.
[3] Dave Elliman. A really useful vectorisation algorithm. GREC 1999, Jaipur, India, September 1999.
[4] Dave Elliman. The coordinate geometry of smooth sequences of lines and arcs. Technical Report 1002, University of Nottingham School of Computer Science, Jubilee Campus, Nottingham, NG8 1BB, UK, March 2001.
[5] John Illingworth and John Kittler. The adaptive Hough transform. IEEE Transactions on Pattern Analysis and Machine Intelligence, 9(5):690-697, 1987.
[6] D.G. Lowe. Three dimensional object recognition from single two dimensional images. Artificial Intelligence, 31:355-397, 1987.